Abstract: Given two distributions F and G on the nonnegative integers we propose an algorithm to 
construct in- and out-degree sequences from samples of i.i.d. observations from F and G, respectively, 
that with high probability will be graphical, that is, from which a simple directed graph can be drawn. 
We then analyze a directed version of the configuration model and show that, provided that F and 
G have finite variance, the probability of obtaining a simple graph is bounded away from zero as the 
number of nodes grows. We show that conditional on the resulting graph being simple, the in- and 
out-degree distributions are (approximately) F and G for large size graphs. Moreover, when the degree 
distributions have only finite mean we show that the elimination of self-loops and multiple edges does 
not significantly change the degree distributions in the resulting simple graph. 
1. Introduction 

In order to study complex systems such as the World Wide Web (WWW) we propose a model for generating a 
simple directed random graph with prescribed degree distributions. The ability to match degree distributions 
to real graphs is perhaps the first characteristic one would desire from a model, and although several models 
that accomplish this for undirected graphs have been proposed in the recent literature [I EE, [01, not 
much has been done for the directed case. In the WWW example that motivates this work, vertices represent 
webpages and the edges represent the links between them. Empirical studies (e.g., @, suggest that both 
the in-degree and out-degree, number of links pointing to a page and the number of outbound links of a page, 
respectively, follow a power-law distribution, a characteristic often referred to as the scale-free property. 

The model we propose in this paper is closely related to the work in Q for undirected graphs, where given 
a probability distribution F, the goal is to provide an algorithm to generate a simple random graph whose 
degree distribution is approximately F. Two of the models presented in Q, as well as the model in pij ),
are in turn related to the well-known configuration model [y, 25], where vertices are given stubs or
half-edges according to a degree sequence $\{d_i\}$ and these stubs are then randomly paired to form edges. To
obtain a prescribed degree distribution, the degree sequence $\{d_i\}$ is chosen as i.i.d. random variables having
distribution F. This method allows great flexibility in terms of the generality of F, which is very important
in the applications we have in mind. The most general of the results presented here require only that the 
degree distributions have finite $(1 + \epsilon)$th moment, and are therefore applicable to a great variety of examples, 
including the WWW. 

For a directed random graph there are two distributions that need to be chosen, the in-degree and
out-degree distributions, denoted respectively $F = \{f_k : k \geq 0\}$ and $G = \{g_k : k \geq 0\}$. The in-degree of a node
corresponds to the number of edges pointing to it, while the out-degree is the number of edges pointing
out. To follow the ideas from [8], [24], we propose to draw the in-degree and out-degree sequences as i.i.d.
observations from distributions F and G. Unlike the undirected case where the only main problem with 
this approach is that the sum of the degrees might not be even, which is necessary to draw an undirected 
graph, in the directed case the corresponding condition is that the sum of the in-degrees and the sum of 
the out-degrees be the same. Since the probability that two i.i.d. sequences will have the same sum, even if
their means are equal, converges to zero as the number of nodes grows to infinity, the first part of the paper 
focuses on how to construct valid degree sequences without significantly destroying their i.i.d. properties. 
Once we have valid degree sequences the problem is how to obtain a simple graph, since the random pairing 
may produce self-loops and multiple edges in the same direction. This problem is addressed in two ways, the 
first of which consists in showing sufficient conditions under which the probability of generating a simple 
graph through random pairing is strictly positive, which in turn suggests repeating the pairing process until 
a simple graph is obtained. The second approach is to simply erase the self-loops and multiple edges of the 
resulting graph. In both cases, one must show that the degree distributions in the final simple graph remain
essentially unchanged. In particular, if we let $f_k^{(n)}$ be the probability that a randomly chosen node from a
graph of size n has in-degree k, and let $g_k^{(n)}$ be the corresponding probability for the out-degree, then we
will show that

$$f_k^{(n)} \to f_k \qquad \text{and} \qquad g_k^{(n)} \to g_k,$$

as $n \to \infty$. We also prove a similar result for the empirical distributions. 

The question of whether a given pair of in- and out-degree sequences $(\{m_i\}, \{d_i\})$ is graphical, i.e., from
which it is possible to draw a simple directed graph, has been recently studied in [13l ll 7) , where algorithms
to realize such graphs have also been analyzed. Random directed graphs with arbitrary degree distributions
have been studied in [21] via generating functions, which can be used to formalize concepts such as
"in-components" and "out-components" as well as to estimate their average size. Models of growing networks
that can be calibrated to mimic the power-law behavior of the WWW have been analyzed using statistical
physics techniques in (Til . [Tif . The approach followed in this paper focuses on one hand on the generation 
of in- and out-degree sequences that are close to being i.i.d. and that are graphical with high probability, 
and on the other hand on providing conditions under which a simple graph can be obtained through random 
pairing. The directed configuration model with (close to) i.i.d. degree sequences, although not a growing 
network model, has the advantage of being analytically tractable and easy to simulate. 

The rest of the paper is organized as follows. In Section 2 we introduce a model to construct in- and
out-degree sequences that are very close to being two independent sequences of i.i.d. random variables having
distributions F and G, respectively, but whose sums are the same; in the same spirit as the results in [1] we
also show that the suggested method produces with high probability a graphical pair of degree sequences.
In Subsection 3.1 we prove sufficient conditions under which the probability that the directed configuration
model will produce a simple graph will be bounded away from zero, and show that conditional on the resulting
graph being simple, the degree sequences have asymptotically the correct distributions. In Subsection 3.2 we
show that under very mild conditions, the process of simply erasing self-loops and multiple edges results in
a graph whose degree distributions are still asymptotically F and G. 



2. Graphs and degree sequences 



As mentioned in the introduction, the goal of this paper is to provide an algorithm for generating a random 
directed graph with n nodes with the property that its in-degrees and out-degrees have some prespecified 
distributions F and G, respectively. Moreover, we would like the resulting graph to be simple, that is, it 
should not contain self-loops or multiple edges in the same direction. The two models that we propose are 
based on the so-called configuration or pairing model, which produces a random undirected graph from a 
degree sequence $\{d_1, d_2, \ldots, d_n\}$. In [1, 0] the prescribed degree distribution is obtained by drawing the
degree sequence $\{d_i\}$ as i.i.d. random variables from that distribution. More details about the configuration
model can be found in Section 3.

Following the same idea of using a sequence of i.i.d. random variables to generate the degree sequence of an 
undirected graph, the natural extension to the directed case would be to draw two i.i.d. sequences from given 
distributions F and G. We note that in the undirected setting the two main problems with this approach 
are: 1) that the sum of the degrees may be odd, in which case it is impossible to draw a graph, and 2) 
that there may not exist a simple graph having the prescribed degree sequence. The first problem is easily
fixed by either sampling the i.i.d. sequence until its sum is even (which will happen with probability 1/2
asymptotically), or simply adding one to the last random number in the sequence. The second problem,
although related to the verification of graphicality criteria (e.g., the Erdős-Gallai criterion [Hj]), turns out
to be negligible as the number of nodes goes to infinity, as the work in [1] shows. For directed graphs a 
graphicality criterion also exists, and the second problem turns out to be negligible for large graphs just as 
in the undirected case. Nonetheless, the equivalent of the first problem is now that the potential in-degree 
and out-degree sequences must have the same sum, which is considerably harder to fix. Before proceeding 
with the formulation of our proposed algorithm we give some basic definitions which will be used throughout 
the paper. 

Definition 2.1. We denote by G(V,E) a directed graph on n nodes or vertices, $V = \{v_1, v_2, \ldots, v_n\}$, 
connected via the set of directed edges E. 

Definition 2.2. We say that G(V,E) is simple if any pair of nodes are connected by at most one edge in 
each direction, and if there are no edges in between a node and itself. 

Definition 2.3. The in-degree $m_i$, respectively, out-degree $d_i$, of node $v_i \in V$ is the total number of edges from
other nodes to $v_i$, respectively, from $v_i$ to other nodes. The pair of sequences
$(m, d) = (\{m_1, m_2, \ldots, m_n\}, \{d_1, d_2, \ldots, d_n\})$ of nonnegative integers is called a bi-degree-sequence if
$m_i$ and $d_i$ correspond to the in-degree and out-degree, respectively, of node $v_i$. 

Definition 2.4. A bi-degree-sequence (m,d) is said to be graphical if there exists a simple directed graph 
G(V, E) on the set of nodes V such that the in-degree and out-degree sequences together form (m,d). In this 
case we say that G realizes the bi-degree-sequence. 

In view of these definitions our goal is to generate the sequences $\{m_i\}$ and $\{d_i\}$ from i.i.d. samples of
given distributions $F = \{f_k : k \geq 0\}$ and $G = \{g_k : k \geq 0\}$, respectively. Both F and G are assumed to
be probability distributions with support on the nonnegative integers with a finite common mean $\mu$. Note
that although the Strong Law of Large Numbers (SLLN) guarantees that if we simply sample i.i.d. random
variables $\{\gamma_1, \ldots, \gamma_n\}$ from F and, independently, i.i.d. random variables $\{\xi_1, \ldots, \xi_n\}$ from G, then

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \gamma_i = \mu = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \xi_i \qquad \text{a.s.},$$

it is also true that in general

$$\lim_{n \to \infty} P\left( \sum_{i=1}^n \gamma_i = \sum_{i=1}^n \xi_i \right) = 0.$$

One potential idea to fix the problem is to sample one of the two sequences, say the in-degrees, as i.i.d.
observations $\{\gamma_1, \ldots, \gamma_n\}$ from F and then sample the second sequence from the conditional distribution G
given that its sum is $\Gamma_n = \sum_{i=1}^n \gamma_i$. This approach has the major drawback that this conditional distribution
may be ill-behaved, in the sense that the probability of the conditioning event, the sum being equal to
$\Gamma_n$, converges to zero in most cases. It follows that we need a different mechanism to sample the degree
sequences. The precise algorithm we propose is described below; we focus on first sampling two independent
i.i.d. sequences and then adding in- or out-degrees as needed to match their sums.

The following definition will be needed throughout the rest of the paper.

Definition 2.5. We say that a function $L(\cdot)$ is slowly varying at infinity if $\lim_{x \to \infty} L(tx)/L(x) = 1$ for
all fixed $t > 0$. A distribution function F is said to be regularly varying with index $\alpha > 0$, $F \in \mathcal{R}_{-\alpha}$, if
$\overline{F}(x) = 1 - F(x) = x^{-\alpha} L(x)$ with $L(\cdot)$ slowly varying.

We will also use the notation $\Rightarrow$ to denote convergence in distribution, $\xrightarrow{P}$ to denote convergence in
probability, and $\mathbb{N} = \{1, 2, 3, \ldots\}$ to refer to the positive integers.
2.1. The Algorithm 

We assume that the target degree distributions F and G have support on the nonnegative integers and have
common mean $\mu > 0$. Moreover, suppose that there exist slowly varying functions $L_F(\cdot)$ and $L_G(\cdot)$ such that

$$\overline{F}(x) = \sum_{k > x} f_k \leq x^{-\alpha} L_F(x) \qquad \text{and} \qquad \overline{G}(x) = \sum_{k > x} g_k \leq x^{-\beta} L_G(x), \qquad (2.1)$$

for all $x > 0$, where $\alpha, \beta > 1$.

We refer the reader to 0] for all the properties of slowly varying functions that will be used in the proofs.
However, we do point out here that the tail conditions in (2.1) ensure that F has finite moments of order s
for all $0 < s < \alpha$, and G has finite moments of order s for all $0 < s < \beta$. The constant

$$\kappa = \min\{1 - \alpha^{-1}, 1 - \beta^{-1}, 1/2\}$$

will play an important role throughout the paper. The algorithm is given below. 

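1. Fix $0 < \delta_0 < \kappa$.

2. Sample an i.i.d. sequence $\{\gamma_1, \ldots, \gamma_n\}$ from distribution F; let $\Gamma_n = \sum_{i=1}^n \gamma_i$.

3. Sample an i.i.d. sequence $\{\xi_1, \ldots, \xi_n\}$ from distribution G; let $\Xi_n = \sum_{i=1}^n \xi_i$.

4. Define $\Delta_n = \Gamma_n - \Xi_n$. If $|\Delta_n| \leq n^{1-\kappa+\delta_0}$ proceed to step 5; otherwise repeat from step 2.

5. Choose randomly $|\Delta_n|$ nodes $\{i_1, i_2, \ldots, i_{|\Delta_n|}\}$ without replacement and let

$$N_i = \gamma_i + \tau_i, \qquad D_i = \xi_i + \chi_i, \qquad i = 1, 2, \ldots, n,$$

where

$$\chi_i = \begin{cases} 1 & \text{if } \Delta_n \geq 0 \text{ and } i \in \{i_1, i_2, \ldots, i_{\Delta_n}\}, \\ 0 & \text{otherwise,} \end{cases} \qquad \text{and} \qquad \tau_i = \begin{cases} 1 & \text{if } \Delta_n < 0 \text{ and } i \in \{i_1, i_2, \ldots, i_{|\Delta_n|}\}, \\ 0 & \text{otherwise.} \end{cases}$$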
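
As an illustration only, steps 1 to 5 can be sketched in Python as follows. The function name `sample_bidegree_sequence`, the callables `sample_f` and `sample_g` (each draws one observation from F and G, respectively), and the `max_tries` safeguard are assumptions made for this sketch; they are not part of the algorithm as stated.

```python
import random


def sample_bidegree_sequence(n, sample_f, sample_g, kappa, delta0,
                             rng=None, max_tries=10_000):
    """Sketch of steps 1-5; returns lists (N, D) with sum(N) == sum(D)."""
    if not 0 < delta0 < kappa:                      # step 1
        raise ValueError("need 0 < delta0 < kappa")
    rng = rng or random.Random()
    threshold = n ** (1 - kappa + delta0)
    for _ in range(max_tries):
        gamma = [sample_f(rng) for _ in range(n)]   # step 2: i.i.d. from F
        xi = [sample_g(rng) for _ in range(n)]      # step 3: i.i.d. from G
        delta = sum(gamma) - sum(xi)                # step 4: Delta_n
        if abs(delta) <= threshold:
            break
    else:
        # max_tries is a hypothetical cap; the stated algorithm simply repeats.
        raise RuntimeError("step 4 was not satisfied within max_tries")
    # Step 5: pick |Delta_n| distinct nodes and add one degree to each.
    chosen = rng.sample(range(n), abs(delta))
    N, D = list(gamma), list(xi)
    for i in chosen:
        if delta >= 0:
            D[i] += 1   # chi_i = 1: the out-degrees catch up
        else:
            N[i] += 1   # tau_i = 1: the in-degrees catch up
    return N, D
```

For instance, taking both F and G uniform on {0, ..., 4} (so that $\kappa = 1/2$), one may call `sample_bidegree_sequence(200, draw, draw, 0.5, 0.1)` with `draw = lambda r: r.randint(0, 4)`. Since $\delta_0 < \kappa$, the threshold is smaller than n, so sampling $|\Delta_n|$ nodes without replacement is always possible.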



Remark 2.6. (i) This algorithm constructs a bi-degree-sequence (N,D) having the property that
$L_n = \sum_{i=1}^n N_i = \sum_{i=1}^n D_i$. (ii) Note that we have used the capital letters $N_i$ and $D_i$ to denote the in-degree
and out-degree, respectively, of node i, as opposed to using the notation $m_i$ and $d_i$ from Definition 2.4; we
do this to emphasize the randomness of the bi-degree-sequence itself. (iii) Clearly, neither $\{N_1, \ldots, N_n\}$ nor
$\{D_1, \ldots, D_n\}$ are i.i.d. sequences, nor are they independent of each other, but we will show in the next section
that asymptotically as n grows to infinity they have the same joint distribution as $(\{\gamma_i\}, \{\xi_i\})$. (iv) We will
also show that the condition in step 4 has probability converging to one. (v) Note that we always choose to
add degrees, rather than fixing one sequence and always adjusting the other one, to avoid having problems with
nodes with in- or out-degree zero. 



2.2. Asymptotic behavior of the degree sequence 



We now provide some results about the asymptotic behavior of the bi-degree-sequence obtained from the
algorithm we propose. The first thing we need to prove is that the algorithm will always end in finite time,
and the only step where we need to be careful is in Step 4, since it may not be obvious that we can always
draw two independent i.i.d. sequences satisfying $|\Delta_n| \leq n^{1-\kappa+\delta_0}$ in a reasonable amount of time. The first
lemma we give establishes that this is indeed the case by showing that the probability of satisfying condition
$|\Delta_n| \leq n^{1-\kappa+\delta_0}$ converges to one as the size of the graph grows. All the proofs in this section can be found
in Subsection 4.1.
Lemma 2.7. Define $\mathcal{D}_n = \{|\Delta_n| \leq n^{1-\kappa+\delta_0}\}$, then

$$\lim_{n \to \infty} P(\mathcal{D}_n) = 1.$$



Since the sums of the in-degrees and out-degrees are the same, we can always draw a graph, but this is not
enough to guarantee that we can draw a simple graph. In other words, we need to determine with what
probability the bi-degree-sequence (N, D) will be graphical, and to do this we first need an appropriate
criterion, e.g., a directed version of the Erdős-Gallai criterion for undirected graphs. The following result 
(Corollary 1 on p. 110 in 0]) gives necessary and sufficient conditions for a bi-degree-sequence to be graphical; 
the original statement is for more general p-graphs, where up to p parallel edges in the same direction are 
allowed. The notation \A\ denotes the cardinality of set A. 

Theorem 2.8. Given a set of n vertices $V = \{v_1, \ldots, v_n\}$, having bi-degree-sequence
$(m, d) = (\{m_1, \ldots, m_n\}, \{d_1, \ldots, d_n\})$, a necessary and sufficient condition for (m, d) to be graphical is

a) $\sum_{i=1}^n m_i = \sum_{i=1}^n d_i$, and

b) $\sum_{i=1}^n \min\{d_i, |A - \{v_i\}|\} \geq \sum_{v_i \in A} m_i$ for any $A \subseteq V$.
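
To make the criterion concrete, the following Python sketch checks conditions (a) and (b) by brute force over all subsets A of V. It is only meant for very small n, since the number of subsets grows as $2^n$; the function name `is_graphical` is a hypothetical choice for this illustration, and condition (b) is read as $\sum_{i=1}^n \min\{d_i, |A - \{v_i\}|\} \geq \sum_{v_i \in A} m_i$.

```python
from itertools import combinations


def is_graphical(m, d):
    """Brute-force check of conditions (a) and (b) for in-degrees m, out-degrees d."""
    n = len(m)
    # Condition (a): the in-degrees and out-degrees must have the same sum.
    if len(d) != n or sum(m) != sum(d):
        return False
    # Condition (b): for every subset A, the edges that can point into A
    # must cover the total in-degree of A.
    for size in range(n + 1):
        for A in combinations(range(n), size):
            members = set(A)
            # Node i can send at most min(d_i, |A - {v_i}|) edges into A.
            capacity = sum(min(d[i], size - (i in members)) for i in range(n))
            demand = sum(m[i] for i in A)
            if capacity < demand:
                return False
    return True
```

For example, `is_graphical([1, 1], [1, 1])` holds (two nodes joined by one edge in each direction), whereas `is_graphical([2, 0], [0, 2])` fails because it would require two parallel edges from the second node to the first.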

We now state a result that shows that for large n, the bi-degree-sequence (N, D) constructed in Subsection
2.1 is with high probability graphical. Related results for undirected graphs can be found in [1], which
includes the case when the degree distribution has infinite mean. 

Theorem 2.9. For the bi-degree-sequence (N,D) constructed in Section 2.1 we have

$$\lim_{n \to \infty} P((N, D) \text{ is graphical}) = 1.$$



The second property of (N, D) that we want to show is that despite the fact that the sequences $\{N_i\}$ and
$\{D_i\}$ are no longer independent nor individually i.i.d., they are still asymptotically so as the number of
vertices n goes to infinity. The intuition behind this result is that the number of degrees that need to be
added to one of the i.i.d. sequences $\{\gamma_i\}$ or $\{\xi_i\}$ to match their sum is small compared to n, and therefore the
sequences $\{N_i\}$ and $\{D_i\}$ are almost i.i.d. and independent of each other. This feature makes the
bi-degree-sequence (N, D) we propose an approximate equivalent of the i.i.d. degree sequence considered in [1], Q 24|
for undirected graphs. 

Theorem 2.10. The bi-degree-sequence (N,D) constructed in Subsection 2.1 satisfies that for any fixed
$r, s \in \mathbb{N}$,

$$(N_{i_1}, \ldots, N_{i_r}, D_{j_1}, \ldots, D_{j_s}) \Rightarrow (\gamma_1, \ldots, \gamma_r, \xi_1, \ldots, \xi_s)$$

as $n \to \infty$, where $\{\gamma_i\}$ and $\{\xi_i\}$ are independent sequences of i.i.d. random variables having distributions F
and G, respectively. 

To end this section, we give a result that establishes regularity conditions of the bi-degree-sequence (N, D) 
which will be important in the sequel. 

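Proposition 2.11. The bi-degree-sequence (N,D) constructed in Subsection 2.1 satisfies

$$\frac{1}{n} \sum_{k=1}^n 1(N_k = i, D_k = j) \xrightarrow{P} f_i g_j, \qquad \text{for all } i, j \in \mathbb{N} \cup \{0\},$$

$$\frac{1}{n} \sum_{i=1}^n N_i \xrightarrow{P} E[\gamma_1], \qquad \frac{1}{n} \sum_{i=1}^n D_i \xrightarrow{P} E[\xi_1], \qquad \text{and} \qquad \frac{1}{n} \sum_{i=1}^n N_i D_i \xrightarrow{P} E[\gamma_1 \xi_1],$$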



N. Chen and M. Olvera-Cravioto / Directed Random Graphs



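as $n \to \infty$, and provided $E[\gamma_1^2 + \xi_1^2] < \infty$,

$$\frac{1}{n} \sum_{i=1}^n N_i^2 \xrightarrow{P} E[\gamma_1^2], \qquad \text{and} \qquad \frac{1}{n} \sum_{i=1}^n D_i^2 \xrightarrow{P} E[\xi_1^2],$$

as $n \to \infty$.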



3. The configuration model 



In the previous section we introduced a model for the generation of a bi-degree-sequence (N, D) that is 
close to being a pair of independent sequences of i.i.d. random variables, but yet has the property of being 
graphical with probability close to one as the size of the graph goes to infinity. We now turn our attention 
to the problem of obtaining a realization of such a sequence, in particular, of drawing a simple graph having 
(N,D) as its bi-degree-sequence. 

The approach that we follow is a directed version of the configuration model. The configuration, or pairing
model, was introduced in (fjj and [25], although earlier related ideas based on symmetric matrices with {0, 1}
entries go back to the early 1970s; see 0, for a survey of the history as well as additional references. The
configuration model is based on the following idea: given a degree sequence $d = \{d_1, \ldots, d_n\}$, to each node $v_i$,
$1 \leq i \leq n$, assign $d_i$ stubs or half-edges, and then pair half-edges to form an edge in the graph by randomly
selecting with equal probability from the remaining set of unpaired half-edges. This procedure results in a
multigraph on n nodes having d as its degree sequence, where the term multigraph refers to the possibility
of self-loops and multiple edges. Although this algorithm does not produce a multigraph uniformly chosen at
random from the set of all multigraphs having degree sequence d, a simple graph uniformly chosen at random
can be obtained by choosing a pairing uniformly at random and discarding the outcome if it has self-loops
or multiple edges [26]. The question that becomes important then is to estimate the probability with which
the pairing model will produce a simple graph. For the undirected graph setting we have described, such
results were given in 0, 0, H3, 22, 2^] for regular d-graphs (graphs where each node has exactly degree d),
and in 3 20, 23 [ for general graphical degree sequences. 



From the previous discussion, it should be clear that it is important to determine conditions under which
the probability of obtaining a simple graph in the pairing model is bounded away from zero as $n \to \infty$.
Such conditions are essentially bounds on the rate of growth of the maximum (minimum) degree and/or the
existence of certain limits (see, e.g., [l8, 2(| 23|). The set of conditions given below is taken from [23], and
we include it here as a reference for the directed version discussed in this paper. 
Condition 3.1. Given a degree sequence $d = \{d_1, \ldots, d_n\}$, let $D^{[n]}$ be the degree of a randomly chosen node,
i.e.,

$$P(D^{[n]} = k) = \frac{1}{n} \sum_{i=1}^n 1(d_i = k).$$

a) Weak convergence. There exists a finite random variable D taking values on the positive integers such
that

$$D^{[n]} \Rightarrow D, \qquad n \to \infty.$$

b) Convergence of the first moment.

$$\lim_{n \to \infty} E[D^{[n]}] = E[D].$$

c) Convergence of the second moment.

$$\lim_{n \to \infty} E[(D^{[n]})^2] = E[D^2].$$

Remark 3.2. It is straightforward to verify that if the degree sequence is chosen as an i.i.d. sample
$\{D_1, \ldots, D_n\}$ from some distribution F on the positive integers having finite first moment, then parts (a)
and (b) of Condition 3.1 are satisfied, and if F has finite second moment then also part (c) is satisfied; the
adjustment made to ensure that the sum of the degrees is even, if needed, can be shown to be negligible.

Condition 3.1 guarantees that the probability of obtaining a simple graph in the pairing model is bounded
away from zero (see, e.g., [23]), in which case we can obtain a uniformly chosen simple realization of the (graphical)
degree sequence $\{d_i\}$ by repeating the random pairing until a simple graph is obtained. When part (c)
of Condition 3.1 fails, an alternative is to simply erase the self-loops and multiple edges. These two
approaches give rise to the repeated and erased configuration models, respectively. 

Having given a brief description of the configuration model for undirected graphs, we will now discuss how to
adapt it to draw directed graphs. The idea is basically the same: given a bi-degree-sequence (m, d), to each
node $v_i$ assign $m_i$ inbound half-edges and $d_i$ outbound half-edges; then, proceed to match inbound half-edges
to outbound half-edges to form directed edges. To be more precise, for each unpaired inbound half-edge of
node $v_i$ choose randomly from all the available unpaired outbound half-edges, and if the selected outbound
half-edge belongs to node, say, $v_j$, then add a directed edge from $v_j$ to $v_i$ to the graph; proceed in this way
until all unpaired inbound half-edges are matched. The following result shows that conditional on the graph
being simple, it is uniformly chosen among all simple directed graphs having bi-degree-sequence (m, d). All
the proofs of Section 3 can be found in Subsection 4.2.
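
The pairing step just described can be sketched in Python as below. This is a minimal illustration under our own naming (`directed_configuration_model` is a hypothetical function name): shuffling the list of outbound half-edges and pairing it in order with the inbound half-edges produces a uniformly random matching, which has the same law as choosing an available outbound half-edge uniformly at random for each inbound half-edge in turn.

```python
import random


def directed_configuration_model(m, d, rng=None):
    """Pair inbound and outbound half-edges at random; returns a list of (tail, head)."""
    if len(m) != len(d) or sum(m) != sum(d):
        raise ValueError("in-degrees and out-degrees must have the same sum")
    rng = rng or random.Random()
    # One entry per half-edge, labelled by the node that owns it.
    inbound = [v for v, k in enumerate(m) for _ in range(k)]
    outbound = [v for v, k in enumerate(d) for _ in range(k)]
    rng.shuffle(outbound)  # uniformly random matching of the half-edges
    # An outbound half-edge of v_j matched to an inbound half-edge of v_i
    # gives the directed edge (v_j, v_i); self-loops and repeats may occur.
    return list(zip(outbound, inbound))
```

The returned edge list describes a multigraph: it always has in-degrees m and out-degrees d, but it may contain self-loops and multiple edges in the same direction.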

Proposition 3.3. Given a graphical bi-degree-sequence (m, d), generate a directed graph according to the 
directed configuration model. Then, conditional on the obtained graph being simple, it is uniformly distributed 
among all simple directed graphs having bi-degree-sequence (m, d). 

The question is now under what conditions will the probability of obtaining a simple graph be bounded 
away from zero as the number of nodes, n, goes to infinity. When this probability is bounded away from 
zero we can repeat the random pairing until we draw a simple graph: the repeated model; otherwise, we can 
always erase the self-loops and multiple edges in the same direction to obtain a simple graph: the erased 
model. These two models are discussed in more detail in the following two subsections, where we also provide 
sufficient conditions under which the probability of obtaining a simple graph will be bounded away from 
zero. 

We end this section by mentioning that another important line of problems related to the drawing of simple 
graphs (directed or undirected) is the development of efficient simulation algorithms, see for example the 
recent work in [f| using importance sampling techniques for drawing a simple graph with prescribed degree 
sequence {di}; similar ideas should also be applicable to the directed model. 

3.1. Repeated Directed Configuration Model 

In this section we analyze the directed configuration model using the bi-degree-sequence (N, D) constructed
in Subsection 2.1. In order to do so we will first need to establish sufficient conditions under which the
probability that the directed configuration model produces a simple graph is bounded away from zero as
the number of nodes goes to infinity. Since this property does not directly depend on the specific
bi-degree-sequence (N,D), we will prove the result for general bi-degree-sequences (m,d) satisfying an analogue of
Condition 3.1. As one may expect, we will require the existence of certain limits related to the (joint)
distribution of the in-degree and out-degree of a randomly chosen node. Also, since the sequences $\{m_i\}$ and
$\{d_i\}$ need to have the same sum, we prefer to consider a sequence of bi-degree-sequences, i.e., $\{(m_n, d_n)\}_{n \in \mathbb{N}}$
where $(m_n, d_n) = (\{m_{n1}, \ldots, m_{nn}\}, \{d_{n1}, \ldots, d_{nn}\})$, since otherwise the equal sum constraint would greatly
restrict the type of sequences we can use (e.g., $m_i = d_i$ for all $i \in \mathbb{N}$). The corresponding version of
Condition 3.1 is given below. 
Condition 3.4. Given a sequence of bi-degree-sequences $\{(m_n, d_n)\}_{n \in \mathbb{N}}$ satisfying

$$\sum_{i=1}^n m_{ni} = \sum_{i=1}^n d_{ni} \qquad \text{for all } n,$$

let $(N^{[n]}, D^{[n]})$ denote the in-degree and out-degree of a randomly chosen node, i.e.,

$$P((N^{[n]}, D^{[n]}) = (i, j)) = \frac{1}{n} \sum_{k=1}^n 1(m_{nk} = i, d_{nk} = j).$$

a) Weak convergence. There exist finite random variables $\gamma$ and $\xi$ taking values on the nonnegative integers
and satisfying $E[\gamma] = E[\xi] > 0$ such that

$$(N^{[n]}, D^{[n]}) \Rightarrow (\gamma, \xi), \qquad n \to \infty.$$

b) Convergence of the first moments.

$$\lim_{n \to \infty} E[N^{[n]}] = E[\gamma] \qquad \text{and} \qquad \lim_{n \to \infty} E[D^{[n]}] = E[\xi].$$

c) Convergence of the covariance.

$$\lim_{n \to \infty} E[N^{[n]} D^{[n]}] = E[\gamma \xi].$$

d) Convergence of the second moments.

$$\lim_{n \to \infty} E[(N^{[n]})^2] = E[\gamma^2] \qquad \text{and} \qquad \lim_{n \to \infty} E[(D^{[n]})^2] = E[\xi^2].$$

We now state a result that says that the number of self-loops and the number of multiple edges produced
by the random pairing converge jointly, as $n \to \infty$, to a pair of independent Poisson random variables. As a
corollary we obtain that the probability of the resulting graph being simple converges to a positive number,
and is therefore bounded away from zero. The proof is an adaptation of the proof of Proposition 7.9 in [23].

Consider the multigraph obtained through the directed configuration model from the bi-degree-sequence
$(m_n, d_n)$, and let $S_n$ be the number of self-loops and $M_n$ be the number of multiple edges in the same
direction, that is, if there are $k \geq 2$ (directed) edges from node $v_i$ to node $v_j$, they contribute $(k - 1)$ to $M_n$.

Proposition 3.5. (Poisson limit of self-loops and multiple edges) If $\{(m_n, d_n)\}_{n \in \mathbb{N}}$ satisfies Condition 3.4
with $E[\gamma] = E[\xi] = \mu > 0$, then

$$(S_n, M_n) \Rightarrow (S, M)$$

as $n \to \infty$, where S and M are two independent Poisson random variables with means

$$\lambda_1 = \frac{E[\gamma \xi]}{\mu} \qquad \text{and} \qquad \lambda_2 = \frac{E[\gamma(\gamma - 1)] \, E[\xi(\xi - 1)]}{2\mu^2},$$

respectively. 

Since the probability of the graph being simple is $P(S_n = 0, M_n = 0)$, we obtain as a consequence the
following theorem.

Theorem 3.6. Under the assumptions of Proposition 3.5,

$$\lim_{n \to \infty} P(\text{graph obtained from } (m_n, d_n) \text{ is simple}) = e^{-\lambda_1 - \lambda_2} > 0.$$
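
As a rough numerical illustration, the limit in Theorem 3.6 can be approximated for a given bi-degree-sequence by replacing expectations with sample averages. The sketch below assumes that the Poisson means of Proposition 3.5 are $\lambda_1 = E[\gamma\xi]/\mu$ and $\lambda_2 = E[\gamma(\gamma-1)]E[\xi(\xi-1)]/(2\mu^2)$; the function name `simple_graph_limit` is hypothetical, and the plug-in estimate is only a heuristic for finite n.

```python
import math


def simple_graph_limit(m, d):
    """Plug-in estimates of (lambda_1, lambda_2, exp(-lambda_1 - lambda_2))."""
    n = len(m)
    if len(d) != n or sum(m) != sum(d) or sum(m) == 0:
        raise ValueError("need equal, positive degree sums")
    mu = sum(m) / n                                     # common mean degree
    mixed = sum(a * b for a, b in zip(m, d)) / n        # estimates E[gamma * xi]
    fact_in = sum(a * (a - 1) for a in m) / n           # estimates E[gamma (gamma - 1)]
    fact_out = sum(b * (b - 1) for b in d) / n          # estimates E[xi (xi - 1)]
    lam1 = mixed / mu                                   # mean number of self-loops
    lam2 = fact_in * fact_out / (2 * mu ** 2)           # mean number of multiple edges
    return lam1, lam2, math.exp(-lam1 - lam2)
```

As a sanity check, when every node has in-degree and out-degree one the pairing is a uniformly random permutation, and the graph is simple exactly when the permutation has no fixed point; the sketch then returns $e^{-1}$, the classical limit of the probability of a derangement.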
It is clear from Proposition 2.11 that Condition 3.4 is satisfied by the bi-degree-sequence (N,D) proposed
in Subsection 2.1 whenever F and G have finite variance. This implies that one way of obtaining a simple
directed graph on n nodes is by first sampling the bi-degree-sequence (N, D) according to Subsection 2.1,
then checking if it is graphical, and if it is, using the directed pairing model to draw a graph, discarding any
realizations that are not simple. Alternatively, since the probability of (N, D) being graphical converges to
one, one could skip the verification of graphicality and re-sample (N, D) each time the pairing needs
to be repeated. 
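
The first of these two options, repeating the pairing for a fixed bi-degree-sequence until the outcome is simple, might be sketched as follows. The function name `repeated_pairing` and the `max_tries` cap are assumptions of this illustration; the cap is there only so that the loop stops when the input happens not to be graphical.

```python
import random


def repeated_pairing(m, d, rng=None, max_tries=10_000):
    """Redo the random pairing until the graph is simple; None if no attempt succeeds."""
    rng = rng or random.Random()
    heads = [v for v, k in enumerate(m) for _ in range(k)]   # inbound half-edges
    tails = [v for v, k in enumerate(d) for _ in range(k)]   # outbound half-edges
    if len(heads) != len(tails):
        raise ValueError("in-degrees and out-degrees must have the same sum")
    for _ in range(max_tries):
        rng.shuffle(tails)                     # a fresh uniformly random pairing
        edges = list(zip(tails, heads))
        no_loops = all(u != v for u, v in edges)
        no_repeats = len(set(edges)) == len(edges)
        if no_loops and no_repeats:
            return edges                       # simple: keep this realization
    return None                                # hypothetical fallback after max_tries
```

When the probability of a simple outcome is bounded away from zero, the number of attempts is stochastically bounded by a geometric random variable, so a moderate cap is rarely reached.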

The last thing we show in this section is that the degree distributions of the resulting simple graph will
have with high probability the prescribed degree distributions F and G, as required. More specifically, if we
let $(N^{(r)}, D^{(r)})$ be the bi-degree-sequence of the final simple graph obtained through the repeated directed
configuration model with bi-degree-sequence (N, D), then we will show that the joint distribution

$$h^{(n)}(i, j) = \frac{1}{n} \sum_{k=1}^n P(N_k^{(r)} = i, D_k^{(r)} = j), \qquad i, j = 0, 1, 2, \ldots,$$

converges to $f_i g_j$, and the empirical distributions,

$$\widehat{f}_k^{(n)} = \frac{1}{n} \sum_{i=1}^n 1(N_i^{(r)} = k) \qquad \text{and} \qquad \widehat{g}_k^{(n)} = \frac{1}{n} \sum_{i=1}^n 1(D_i^{(r)} = k), \qquad k = 0, 1, 2, \ldots,$$

converge in probability to $f_k$ and $g_k$, respectively. The same result was shown in Q for the undirected case
with i.i.d. degree sequence $\{D_i\}$.

Proposition 3.7. For the repeated directed configuration model with bi-degree-sequence (N,D), as
constructed in Subsection 2.1, we have:

a) $h^{(n)}(i, j) \to f_i g_j$ as $n \to \infty$, $i, j = 0, 1, 2, \ldots$, and

b) for all $k = 0, 1, 2, \ldots$,

$$\widehat{f}_k^{(n)} \xrightarrow{P} f_k \qquad \text{and} \qquad \widehat{g}_k^{(n)} \xrightarrow{P} g_k, \qquad n \to \infty.$$

Remark 3.8. Note that by the continuous mapping theorem, (a) implies that the marginal distributions of
the in-degrees and out-degrees,

$$f^{(n)}(i) = \frac{1}{n} \sum_{k=1}^n P(N_k^{(r)} = i) \qquad \text{and} \qquad g^{(n)}(j) = \frac{1}{n} \sum_{k=1}^n P(D_k^{(r)} = j),$$

converge to $f_i$ and $g_j$, respectively. The same arguments used in the proof also give that the joint empirical
distribution converges to $f_i g_j$ in probability. 



3.2. Erased directed configuration model 

In this section we consider the erased directed configuration model, which is particularly useful when the
probability of drawing a simple graph converges to zero as the number of nodes increases, which could
happen, for example, when Condition 3.4 (d) fails. Given a bi-degree-sequence (m, d), the erased model
consists in first obtaining a multigraph according to the directed configuration model and then erasing all
self-loops and merging multiple edges in the same direction into a single edge, with the result being a simple
graph. Note that the graph obtained through this process in general no longer has (m, d) as its bi-degree-sequence. 
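
The erasing step itself is elementary. A minimal Python sketch, in which the multigraph is represented as a list of (tail, head) pairs and the function names `erase_to_simple` and `bidegree_sequence` are hypothetical, could read:

```python
def erase_to_simple(edges):
    """Drop self-loops and merge parallel edges of a directed multigraph."""
    # A set keeps one copy of each (tail, head) pair; u != v removes self-loops.
    return sorted({(u, v) for u, v in edges if u != v})


def bidegree_sequence(n, edges):
    """In-degree and out-degree lists of a directed graph on nodes 0, ..., n-1."""
    indeg, outdeg = [0] * n, [0] * n
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return indeg, outdeg
```

For example, the multigraph with edges (0,1), (0,1), (1,1), (1,0) has in-degrees [1, 3] and out-degrees [2, 2], while after erasing only (0,1) and (1,0) remain and both sequences become [1, 1].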

As for the repeated model, let $(N^{(e)}, D^{(e)})$ be the bi-degree-sequence of the simple graph obtained through
the erased directed configuration model with bi-degree-sequence (N, D). Define the joint distribution

$$h^{(n)}(i, j) = \frac{1}{n} \sum_{k=1}^n P(N_k^{(e)} = i, D_k^{(e)} = j), \qquad i, j = 0, 1, 2, \ldots,$$

and the empirical distributions,

$$\widehat{f}_k^{(n)} = \frac{1}{n} \sum_{i=1}^n 1(N_i^{(e)} = k) \qquad \text{and} \qquad \widehat{g}_k^{(n)} = \frac{1}{n} \sum_{i=1}^n 1(D_i^{(e)} = k), \qquad k = 0, 1, 2, \ldots.$$

The following result is the analogue of Proposition 3.7 for the erased model.

Proposition 3.9. For the erased directed configuration model with bi-degree-sequence (N,D), as constructed
in Subsection 2.1, we have:

a) $h^{(n)}(i, j) \to f_i g_j$ as $n \to \infty$, $i, j = 0, 1, 2, \ldots$, and

b) for all $k = 0, 1, 2, \ldots$,

$$\widehat{f}_k^{(n)} \xrightarrow{P} f_k \qquad \text{and} \qquad \widehat{g}_k^{(n)} \xrightarrow{P} g_k, \qquad n \to \infty.$$



4. Proofs 



In this section we give the proofs of all the results in the paper. We divide the proofs into two subsections,
one containing those belonging to Section 2 and one containing those belonging to Section 3. Throughout the remainder
of the paper we use the following notation: $g(x) \sim f(x)$ if $\lim_{x \to \infty} g(x)/f(x) = 1$, $g(x) = O(f(x))$ if
$\limsup_{x \to \infty} g(x)/f(x) < \infty$, and $g(x) = o(f(x))$ if $\lim_{x \to \infty} g(x)/f(x) = 0$.



4.1. Degree Sequences 

This subsection contains the proofs of Lemma 2.7, Theorems 2.9 and 2.10, and Proposition 2.11.

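Proof of Lemma 2.7. Let $Z_i = \gamma_i - \xi_i$ and note that the $\{Z_i\}$ are i.i.d. mean zero random variables. If 
$E[Z_1^2] < \infty$, then Chebyshev's inequality gives 

$$P(\mathcal{D}_n^c) = P\left( \left| \sum_{i=1}^n Z_i \right| > n^{1-\kappa+\delta_0} \right) \le \frac{n \, \mathrm{Var}(Z_1)}{n^{2(1-\kappa+\delta_0)}} \to 0$$

as $n \to \infty$. 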



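Suppose now that $E[Z_1^2] = \infty$, which implies that $\kappa = 1 - \max\{\alpha^{-1}, \beta^{-1}\} \in (0, 1/2]$. Let $\theta = \max\{\alpha^{-1}, \beta^{-1}\}$, 
define $t_n = n^{\theta+\epsilon}$, $0 < \epsilon < \min\{\delta_0, \theta^{-1} - \theta\}$, and let $\{\tilde Z_i\}$ be a sequence of i.i.d. random variables having 
distribution $P(\tilde Z_1 \le x) = P(Z_1 \le x \mid |Z_1| \le t_n)$. Then, 

$$\begin{aligned}
P\left( \left| \sum_{i=1}^n Z_i \right| > n^{1-\kappa+\delta_0} \right)
&= P\left( \left| \sum_{i=1}^n \tilde Z_i \right| > n^{1-\kappa+\delta_0} \right) P(|Z_1| \le t_n)^n + P\left( \left| \sum_{i=1}^n Z_i \right| > n^{1-\kappa+\delta_0}, \ \max_{1 \le i \le n} |Z_i| > t_n \right) \\
&\le P\left( \left| \sum_{i=1}^n \tilde Z_i \right| > n^{1-\kappa+\delta_0} \right) + P\left( \max_{1 \le i \le n} |Z_i| > t_n \right).
\end{aligned}$$

By the union bound, 

$$\begin{aligned}
P\left( \max_{1 \le i \le n} |Z_i| > t_n \right) &\le n P(|Z_1| > t_n) \le n P(\gamma_1 + \xi_1 > t_n) \le n P(\gamma_1 > t_n/2) + n P(\xi_1 > t_n/2) \\
&\le n (t_n/2)^{-\alpha} L_F(t_n/2) + n (t_n/2)^{-\beta} L_G(t_n/2) \\
&= O\left( n^{1-\alpha(\theta+\epsilon)} L_F(t_n) + n^{1-\beta(\theta+\epsilon)} L_G(t_n) \right) \\
&= O\left( n^{-\alpha\epsilon} L_F(t_n) + n^{-\beta\epsilon} L_G(t_n) \right)
\end{aligned}$$

as $n \to \infty$, which converges to zero by basic properties of slowly varying functions (see, e.g., Proposition 1.3.6 
in Q). Next, note that since $E[Z_1] = 0$, 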

$$|E[\tilde Z_1]| = \frac{|E[Z_1 1(|Z_1| \le t_n)]|}{P(|Z_1| \le t_n)} = \frac{|E[Z_1 1(|Z_1| > t_n)]|}{P(|Z_1| \le t_n)} \le (1 + o(1)) \left( t_n P(|Z_1| > t_n) + \int_{t_n}^\infty P(|Z_1| > z) \, dz \right).$$

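To estimate the integral note that 

$$\begin{aligned}
\int_{t_n}^\infty P(|Z_1| > z) \, dz &\le \int_{t_n}^\infty \left( P(\gamma_1 > z/2) + P(\xi_1 > z/2) \right) dz \\
&\le 2 \int_{t_n/2}^\infty \left( u^{-\alpha} L_F(u) + u^{-\beta} L_G(u) \right) du \\
&\sim 2 \left( (\alpha-1)^{-1} (t_n/2)^{-\alpha+1} L_F(t_n/2) + (\beta-1)^{-1} (t_n/2)^{-\beta+1} L_G(t_n/2) \right),
\end{aligned}$$

where in the third step we used Proposition 1.5.10 in jij. Now note that 

$$\min\{ (\alpha-1)(\theta+\epsilon), \ (\beta-1)(\theta+\epsilon) \} = (\theta^{-1} - 1)(\theta+\epsilon) = \kappa + \epsilon(\theta^{-1} - 1),$$

from where it follows that 

$$|E[\tilde Z_1]| = O\left( n^{-\kappa - \epsilon(\theta^{-1}-1)} \left( L_F(t_n) + L_G(t_n) \right) \right) = o\left( n^{-\kappa+\delta_0} \right)$$

as $n \to \infty$. In view of this, we can use Chebyshev's inequality to obtain 

$$P\left( \left| \sum_{i=1}^n \tilde Z_i \right| > n^{1-\kappa+\delta_0} \right) \le P\left( \left| \sum_{i=1}^n \tilde Z_i - n E[\tilde Z_1] \right| + n |E[\tilde Z_1]| > n^{1-\kappa+\delta_0} \right) \le \frac{\mathrm{Var}(\tilde Z_1)}{n^{1-2(\kappa-\delta_0)} (1 + o(1))}. \tag{4.1}$$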



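Finally, to see that this last bound converges to zero note that 

$$\mathrm{Var}(\tilde Z_1) \le E[\tilde Z_1^2] = \frac{1}{P(|Z_1| \le t_n)} E\left[ Z_1^2 1(|Z_1| \le t_n) \right] \le (1 + o(1)) E\left[ |Z_1|^{\theta^{-1}-\epsilon} \right] t_n^{2 - \theta^{-1} + \epsilon},$$

where $E[|Z_1|^{\theta^{-1}-\epsilon}] < \infty$ by the remark following (2.1). We conclude that (4.1) is of order 

$$O\left( t_n^{2-\theta^{-1}+\epsilon} \, n^{2(\kappa-\delta_0)-1} \right) = O\left( n^{(\theta+\epsilon)(2-\theta^{-1}+\epsilon) + 2(\kappa-\delta_0) - 1} \right) = O\left( n^{-2(\delta_0-\epsilon)} \right) = o(1)$$

as $n \to \infty$. This completes the proof. □ 

Before giving the proof of Theorem 2.9 we will need the following preliminary lemma. 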



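Lemma 4.1. Let $\{X_1, \dots, X_n\}$ be an i.i.d. sequence of nonnegative random variables having distribution 
function $V$, and let $X^{(i)}$ denote the $i$th order statistic. Then, for any $k \le n$, 

$$\sum_{i=n-k+1}^n E\left[ X^{(i)} \right] \le \int_0^\infty \min\left\{ n \bar V(x), k \right\} dx.$$

Proof. Note that 

$$E\left[ X^{(i)} \right] = \int_0^\infty P\left( X^{(i)} > x \right) dx = \int_0^\infty \sum_{j=n-i+1}^n \binom{n}{j} \bar V(x)^j V(x)^{n-j} \, dx,$$

from where it follows that 

$$\sum_{i=n-k+1}^n E\left[ X^{(i)} \right] = \sum_{i=n-k+1}^n \sum_{j=n-i+1}^n \binom{n}{j} \int_0^\infty \bar V(x)^j V(x)^{n-j} \, dx = \int_0^\infty E\left[ \min\{ B(n, \bar V(x)), k \} \right] dx,$$

where $B(n,p)$ is a Binomial$(n,p)$ random variable. Since the function $u(t) = \min\{t, k\}$ is concave, Jensen's 
inequality gives 

$$E\left[ \min\{ B(n, \bar V(x)), k \} \right] \le \min\left\{ E[B(n, \bar V(x))], k \right\} = \min\left\{ n \bar V(x), k \right\}.$$

□ 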
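
As a sanity check of Lemma 4.1, both sides can be computed in closed form when the $X_i$ are exponential with rate one. The sketch below assumes that special case only (it is not part of the paper's argument): it uses the standard identity $E[X^{(i)}] = \sum_{j=n-i+1}^n 1/j$ for exponential order statistics and the fact that $\int_0^\infty \min\{n e^{-x}, k\} dx = k(1 + \log(n/k))$. The function names are hypothetical.

```python
import math

def top_k_mean_exponential(n, k):
    """Sum of E[X^(i)] over the k largest order statistics of n i.i.d. Exp(1)
    variables, using E[X^(i)] = sum_{j=n-i+1}^{n} 1/j."""
    return sum(
        sum(1.0 / j for j in range(n - i + 1, n + 1))
        for i in range(n - k + 1, n + 1)
    )

def lemma_bound_exponential(n, k):
    """Right-hand side of Lemma 4.1 for Exp(1): the integrand equals k up to
    x = log(n/k) and n*exp(-x) afterwards, giving k*log(n/k) + k."""
    return k * (1.0 + math.log(n / k))
```

For instance, with `n = 10` and `k = 3` the left-hand side is about 6.29 and the bound is about 6.61; the two sides coincide when `k = n`.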



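Proof of Theorem 2.9. Since by construction $\sum_{i=1}^n N_i = \sum_{i=1}^n D_i$, it follows from Theorem 2.8 that it suffices 
to show that 

$$\lim_{n\to\infty} P\left( \max_{A \subseteq V} \left( \sum_{v_i \in A} N_i - \sum_{i=1}^n \min\{ D_i, |A - \{v_i\}| \} \right) > 0 \right) = 0.$$

Fix $0 < \epsilon < \min\{\beta - 1, \alpha - 1, 1/2\}$ and use the union bound to obtain 

$$\begin{aligned}
P\left( \max_{A \subseteq V} \left( \sum_{v_i \in A} N_i - \sum_{i=1}^n \min\{ D_i, |A - \{v_i\}| \} \right) > 0 \right)
&\le P\left( \max_{A \subseteq V, \, |A| \le n^{(1+\epsilon)/\beta}} \left( \sum_{v_i \in A} N_i - \sum_{i=1}^n \min\{ D_i, |A - \{v_i\}| \} \right) > 0 \right) && (4.2) \\
&\quad + P\left( \max_{A \subseteq V, \, |A| > n^{(1+\epsilon)/\beta}} \left( \sum_{v_i \in A} N_i - \sum_{i=1}^n \min\{ D_i, |A - \{v_i\}| \} \right) > 0 \right). && (4.3)
\end{aligned}$$

By conditioning on how many of the $D_i$ are larger than $n^{(1+\epsilon)/\beta}$ we obtain that (4.3) is bounded by 

$$\begin{aligned}
&P\left( \max_{A \subseteq V, \, |A| > n^{(1+\epsilon)/\beta}} \left( \sum_{v_i \in A} N_i - \sum_{i=1}^n \min\{ D_i, |A - \{v_i\}| \} \right) > 0, \ \max_{1 \le i \le n} D_i \le n^{(1+\epsilon)/\beta} \right) + P\left( \max_{1 \le i \le n} D_i > n^{(1+\epsilon)/\beta} \right) \\
&\quad \le P\left( \max_{A \subseteq V, \, |A| > n^{(1+\epsilon)/\beta}} \left( \sum_{v_i \in A} N_i - \sum_{i=1}^n D_i \right) > 0 \right) + P\left( \max_{1 \le i \le n} D_i > n^{(1+\epsilon)/\beta} \right) \\
&\quad = P\left( \left. \max_{1 \le i \le n} (\xi_i + \chi_i) > n^{(1+\epsilon)/\beta} \right| \mathcal{D}_n \right),
\end{aligned}$$

where $\mathcal{D}_n$ was defined in Lemma 2.7. Now note that by the union bound we have 

$$\begin{aligned}
P\left( \left. \max_{1 \le i \le n} (\xi_i + \chi_i) > n^{(1+\epsilon)/\beta} \right| \mathcal{D}_n \right)
&\le \frac{1}{P(\mathcal{D}_n)} P\left( \max_{1 \le i \le n} (\xi_i + \chi_i) > n^{(1+\epsilon)/\beta} \right) \\
&\le \frac{n}{P(\mathcal{D}_n)} \left( n^{(1+\epsilon)/\beta} - 1 \right)^{-\beta} L_G\left( n^{(1+\epsilon)/\beta} - 1 \right) \\
&= O\left( n^{-\epsilon} L_G\left( n^{(1+\epsilon)/\beta} \right) \right) = o(1),
\end{aligned}$$

as $n \to \infty$, where the last step follows from Lemma 2.7 and basic properties of slowly varying functions (see, 
e.g., Chapter 1 in Q). 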

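Next, to analyze (4.2) let $k_n = \lfloor n^{(1+\epsilon)/\beta} \rfloor$ and note that we can write it as 

$$\begin{aligned}
&P\left( \max_{A \subseteq V, \, 1 \le |A| \le k_n} \left( \sum_{v_i \in A} N_i - \sum_{i=1}^n \min\{ D_i, |A - \{v_i\}| \} \right) > 0 \right) \\
&\quad \le P\left( \max\left\{ \max_{A \subseteq V, \, 2 \le |A| \le k_n} \left( \sum_{v_i \in A} N_i - \sum_{i=1}^n \min\{D_i, 1\} \right), \ \max_{1 \le j \le n} \left( N_j - \sum_{i=1}^n \min\{ D_i, |\{v_j\} - \{v_i\}| \} \right) \right\} > 0 \right) \\
&\quad \le P\left( \max\left\{ \sum_{i=n-k_n+1}^n N^{(i)}, \ (N+D)^{(n)} \right\} - \sum_{i=1}^n \min\{D_i, 1\} > 0 \right),
\end{aligned}$$

where $x^{(i)}$ is the $i$th smallest of $\{x_1, \dots, x_n\}$. Now let $a_0 = E[\min\{\xi_1, 1\}] = \bar G(0) > 0$ and split the last 
probability as follows 

$$\begin{aligned}
&P\left( \max\left\{ \sum_{i=n-k_n+1}^n N^{(i)}, \ (N+D)^{(n)} \right\} - \sum_{i=1}^n \min\{D_i, 1\} > 0 \right) \\
&\quad \le P\left( \max\left\{ \sum_{i=n-k_n+1}^n N^{(i)}, \ (N+D)^{(n)} \right\} - \sum_{i=1}^n \min\{D_i, 1\} > 0, \ \sum_{i=1}^n \min\{D_i, 1\} \ge a_0 n - n^{1/2+\epsilon} \right) && (4.4) \\
&\qquad + P\left( \sum_{i=1}^n \min\{D_i, 1\} < a_0 n - n^{1/2+\epsilon} \right). && (4.5)
\end{aligned}$$

To bound (4.5) use $D_i \ge \xi_i$ for all $i = 1, \dots, n$ and Chebyshev's inequality to obtain 

$$P\left( \sum_{i=1}^n \min\{D_i, 1\} < a_0 n - n^{1/2+\epsilon} \right) \le \frac{1}{P(\mathcal{D}_n)} P\left( \sum_{i=1}^n (a_0 - \min\{\xi_i, 1\}) > n^{1/2+\epsilon} \right) \le \frac{n \, \mathrm{Var}(\min\{\xi_1, 1\})}{P(\mathcal{D}_n) \, n^{1+2\epsilon}} = O\left( n^{-2\epsilon} \right),$$

while the union bound gives that (4.4) is bounded by 

$$P\left( \max\left\{ \sum_{i=n-k_n+1}^n N^{(i)}, \ (N+D)^{(n)} \right\} > b_n \right) \le P\left( \sum_{i=n-k_n+1}^n N^{(i)} > b_n \right) + P\left( (N+D)^{(n)} > b_n \right),$$

where $b_n = a_0 n - n^{1/2+\epsilon}$. For the second probability the union bound again gives 

$$\begin{aligned}
P\left( (N+D)^{(n)} > b_n \right) &\le P\left( N^{(n)} > b_n/2 \right) + P\left( D^{(n)} > b_n/2 \right) \\
&\le \frac{n}{P(\mathcal{D}_n)} \left( P(\gamma_1 + \tau_1 > b_n/2) + P(\xi_1 + \chi_1 > b_n/2) \right) \\
&\le \frac{n}{P(\mathcal{D}_n)} \left( (b_n/2 - 1)^{-\alpha} L_F(b_n/2 - 1) + (b_n/2 - 1)^{-\beta} L_G(b_n/2 - 1) \right) \\
&= O\left( n^{-\alpha+1} L_F(n) + n^{-\beta+1} L_G(n) \right) = o(1)
\end{aligned}$$

as $n \to \infty$. Finally, by Markov's inequality and Lemma 4.1, 

$$\begin{aligned}
P\left( \sum_{i=n-k_n+1}^n N^{(i)} > b_n \right) &\le \frac{1}{b_n} E\left[ \sum_{i=n-k_n+1}^n N^{(i)} \right] \le \frac{1}{b_n P(\mathcal{D}_n)} E\left[ \sum_{i=n-k_n+1}^n \left( \gamma^{(i)} + 1 \right) \right] \\
&\le \frac{1}{b_n P(\mathcal{D}_n)} \left( \int_0^\infty \min\{ n \bar F(x), k_n \} \, dx + k_n \right) \\
&= a_0^{-1} (1 + o(1)) \int_0^\infty \min\left\{ \bar F(x), n^{(1+\epsilon)/\beta - 1} \right\} dx + o(1) \\
&\le a_0^{-1} (1 + o(1)) \left( n^{(1+\epsilon)/\beta - 1} + \int_1^\infty \min\left\{ K x^{-\alpha+\epsilon}, n^{(1+\epsilon)/\beta - 1} \right\} dx \right) + o(1) \\
&= o(1) + O\left( \int_1^\infty \min\left\{ x^{-\alpha+\epsilon}, n^{(1+\epsilon)/\beta - 1} \right\} dx \right)
\end{aligned}$$

as $n \to \infty$, where $K = \sup_{t \ge 1} t^{-\epsilon} L_F(t) < \infty$. Since 

$$\begin{aligned}
\int_1^\infty \min\left\{ x^{-\alpha+\epsilon}, n^{(1+\epsilon)/\beta - 1} \right\} dx &= n^{(1+\epsilon)/\beta - 1} \left( n^{(\beta-1-\epsilon)/(\beta(\alpha-\epsilon))} - 1 \right) + \int_{n^{(\beta-1-\epsilon)/(\beta(\alpha-\epsilon))}}^\infty x^{-\alpha+\epsilon} \, dx \\
&= O\left( n^{-(\beta-1-\epsilon)(\alpha-1-\epsilon)/(\beta(\alpha-\epsilon))} \right) = o(1),
\end{aligned}$$

the proof is complete. □ 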
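
The event analyzed in this proof is the failure of a subset inequality. Reading off the limit at the start of the proof, the graphicality criterion taken from Theorem 2.8 is assumed here to be: the two degree sums agree and, for every subset $A$ of nodes, $\sum_{v_i \in A} N_i \le \sum_{i=1}^n \min\{D_i, |A - \{v_i\}|\}$. A brute-force check of that condition, feasible only for small $n$ since it enumerates all subsets, might look as follows (`is_graphical` is a hypothetical name).

```python
from itertools import combinations

def is_graphical(in_deg, out_deg):
    """Check the subset condition for a bi-degree-sequence by enumerating
    every nonempty subset A of nodes (exponential in n)."""
    n = len(in_deg)
    if sum(in_deg) != sum(out_deg):
        return False
    for size in range(1, n + 1):
        for A in combinations(range(n), size):
            members = set(A)
            lhs = sum(in_deg[i] for i in A)
            # |A - {v_i}| equals |A| - 1 when v_i is in A and |A| otherwise.
            rhs = sum(
                min(out_deg[i], size - (1 if i in members else 0))
                for i in range(n)
            )
            if lhs > rhs:
                return False
    return True
```

For example, in-degrees `[2, 0]` with out-degrees `[0, 2]` fail the condition at `A = {v_1}`, matching the fact that the only way to realize them uses two parallel edges.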
The last two proofs of this section are those of Theorem 2.10 and Proposition 2.11. 

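Proof of Theorem 2.10. Let $u : \mathbb{N}^{r+s} \to [-M, M]$, $M > 0$, be a continuous bounded function, and let $\Delta_n$, $\mathcal{D}_n$ 
be defined as in Lemma 2.7. Then, 

$$\begin{aligned}
&\left| E[u(N_{i_1}, \dots, N_{i_r}, D_{j_1}, \dots, D_{j_s})] - E[u(\gamma_1, \dots, \gamma_r, \xi_1, \dots, \xi_s)] \right| \\
&\quad \le \left| E\left[ \left. u(\gamma_{i_1} + \tau_{i_1}, \dots, \gamma_{i_r} + \tau_{i_r}, \xi_{j_1} + \chi_{j_1}, \dots, \xi_{j_s} + \chi_{j_s}) - u(\gamma_{i_1}, \dots, \gamma_{i_r}, \xi_{j_1}, \dots, \xi_{j_s}) \right| \mathcal{D}_n \right] \right| && (4.6) \\
&\qquad + \left| E\left[ \left. u(\gamma_{i_1}, \dots, \gamma_{i_r}, \xi_{j_1}, \dots, \xi_{j_s}) \right| \mathcal{D}_n \right] - E[u(\gamma_1, \dots, \gamma_r, \xi_1, \dots, \xi_s)] \right|. && (4.7)
\end{aligned}$$

Let $T = \sum_{t=1}^r \tau_{i_t} + \sum_{t=1}^s \chi_{j_t}$. Since $u$ is bounded then (4.6) is smaller than or equal to 

$$\begin{aligned}
&E\left[ \left. \left| u(\gamma_{i_1} + \tau_{i_1}, \dots, \gamma_{i_r} + \tau_{i_r}, \xi_{j_1} + \chi_{j_1}, \dots, \xi_{j_s} + \chi_{j_s}) - u(\gamma_{i_1}, \dots, \xi_{j_s}) \right| 1(T \ge 1) \right| \mathcal{D}_n \right] \\
&\quad \le 2M P(T \ge 1 \mid \mathcal{D}_n) \le 2M \left( \sum_{t=1}^r P(\tau_{i_t} = 1 \mid \mathcal{D}_n) + \sum_{t=1}^s P(\chi_{j_t} = 1 \mid \mathcal{D}_n) \right) \\
&\quad = \frac{2M}{P(\mathcal{D}_n)} \left( \sum_{t=1}^r E[1(\tau_{i_t} = 1, \mathcal{D}_n)] + \sum_{t=1}^s E[1(\chi_{j_t} = 1, \mathcal{D}_n)] \right).
\end{aligned}$$

To compute the last expectations let $\mathcal{F}_n = \sigma(\gamma_1, \dots, \gamma_n, \xi_1, \dots, \xi_n)$ and note that 

$$E[1(\chi_{j_t} = 1, \mathcal{D}_n)] = E\left[ 1(\mathcal{D}_n) E[1(\chi_{j_t} = 1) \mid \mathcal{F}_n] \right] = E\left[ 1(\mathcal{D}_n, \Delta_n \ge 0) \frac{\binom{n-1}{\Delta_n - 1}}{\binom{n}{\Delta_n}} \right] = E\left[ 1(\mathcal{D}_n, \Delta_n \ge 0) \frac{\Delta_n}{n} \right],$$

and symmetrically, 

$$E[1(\tau_{i_t} = 1, \mathcal{D}_n)] = E\left[ 1(\mathcal{D}_n, \Delta_n < 0) \frac{|\Delta_n|}{n} \right],$$

from where it follows that (4.6) is bounded by 

$$2M \, E\left[ \left. s \frac{|\Delta_n|}{n} 1(\Delta_n \ge 0) + r \frac{|\Delta_n|}{n} 1(\Delta_n < 0) \right| \mathcal{D}_n \right] \le 2M (r+s) n^{-\kappa+\delta_0} = o(1)$$

as $n \to \infty$. To analyze (4.7) we first note that by Lemma 2.7, $P(\mathcal{D}_n) \to 1$ as $n \to \infty$, hence 

$$E\left[ \left. u(\gamma_{i_1}, \dots, \gamma_{i_r}, \xi_{j_1}, \dots, \xi_{j_s}) \right| \mathcal{D}_n \right] = \frac{1}{P(\mathcal{D}_n)} E[u(\gamma_1, \dots, \gamma_r, \xi_1, \dots, \xi_s) 1(\mathcal{D}_n)] = E[u(\gamma_1, \dots, \gamma_r, \xi_1, \dots, \xi_s) 1(\mathcal{D}_n)] + o(1).$$

Therefore, (4.7) is equal to 

$$\left| E[u(\gamma_1, \dots, \gamma_r, \xi_1, \dots, \xi_s) 1(\mathcal{D}_n^c)] + o(1) \right| \le M P(\mathcal{D}_n^c) + o(1) \to 0$$

as $n \to \infty$, which completes the proof. 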

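□ 

Proof of Proposition 2.11. Fix $\epsilon > 0$ and let $\mathcal{D}_n = \{ |\Delta_n| \le n^{1-\kappa+\delta_0} \}$. For the first limit fix $i, j = 0, 1, 2, \dots$ 
and note that by the union bound, 

$$\begin{aligned}
&P\left( \left| \frac{1}{n} \sum_{k=1}^n 1(N_k = i, D_k = j) - f_i g_j \right| > \epsilon \right) \\
&\quad \le P\left( \left. \left| \frac{1}{n} \sum_{k=1}^n 1(\gamma_k = i, \xi_k = j) - f_i g_j \right| > \epsilon/2 \right| \mathcal{D}_n \right) + P\left( \left. \left| \frac{1}{n} \sum_{k=1}^n \left( 1(\gamma_k + \tau_k = i, \xi_k + \chi_k = j) - 1(\gamma_k = i, \xi_k = j) \right) \right| > \epsilon/2 \right| \mathcal{D}_n \right) \\
&\quad \le P\left( \left. \left| \frac{1}{n} \sum_{k=1}^n \left( 1(\gamma_k + \tau_k = i, \xi_k + \chi_k = j) - 1(\gamma_k = i, \xi_k = j) \right) \right| > \epsilon/2 \right| \mathcal{D}_n \right) + \frac{1}{P(\mathcal{D}_n) \, n (\epsilon/2)^2} \mathrm{Var}(1(\gamma_1 = i, \xi_1 = j)),
\end{aligned}$$

where in the last step we used Chebyshev's inequality. Clearly, $\mathrm{Var}(1(\gamma_1 = i, \xi_1 = j)) = f_i g_j (1 - f_i g_j)$, and 
since by Lemma 2.7, $P(\mathcal{D}_n) \to 1$ as $n \to \infty$, then the second term converges to zero. To analyze the first 
term note that at most one of $\chi_k$ or $\tau_k$ can be one, hence, 

$$\begin{aligned}
&P\left( \left. \left| \frac{1}{n} \sum_{k=1}^n \left( 1(\gamma_k + \tau_k = i, \xi_k + \chi_k = j) - 1(\gamma_k = i, \xi_k = j) \right) \right| > \epsilon/2 \right| \mathcal{D}_n \right) \\
&\quad \le P\left( \left. \frac{1}{n} \sum_{k=1}^n \left( |1(\xi_k + \chi_k = j) - 1(\xi_k = j)| + |1(\gamma_k + \tau_k = i) - 1(\gamma_k = i)| \right) > \epsilon/2 \right| \mathcal{D}_n \right) \\
&\quad \le P\left( \left. \frac{1}{n} \sum_{k=1}^n \left( 1(\chi_k = 1) + 1(\tau_k = 1) \right) > \epsilon/2 \right| \mathcal{D}_n \right) \\
&\quad = P\left( \left. \frac{|\Delta_n|}{n} > \epsilon/2 \right| \mathcal{D}_n \right) \\
&\quad \le 1\left( n^{-\kappa+\delta_0} > \epsilon/2 \right) \to 0
\end{aligned}$$

as $n \to \infty$. 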

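Next, for the average degrees we have 

$$\begin{aligned}
P\left( \left| \frac{1}{n} \sum_{i=1}^n N_i - E[\gamma_1] \right| > \epsilon \right) &= P\left( \left. \left| \frac{1}{n} \sum_{i=1}^n (\gamma_i + \tau_i) - E[\gamma_1] \right| > \epsilon \right| \mathcal{D}_n \right) \\
&\le P\left( \left. \left| \frac{1}{n} \sum_{i=1}^n \gamma_i - E[\gamma_1] \right| + \frac{|\Delta_n|}{n} > \epsilon \right| \mathcal{D}_n \right) \\
&\le \frac{1}{P(\mathcal{D}_n)} P\left( \left| \frac{1}{n} \sum_{i=1}^n \gamma_i - E[\gamma_1] \right| + n^{-\kappa+\delta_0} > \epsilon \right), && (4.8)
\end{aligned}$$

symmetrically, 

$$P\left( \left| \frac{1}{n} \sum_{i=1}^n D_i - E[\xi_1] \right| > \epsilon \right) \le \frac{1}{P(\mathcal{D}_n)} P\left( \left| \frac{1}{n} \sum_{i=1}^n \xi_i - E[\xi_1] \right| + n^{-\kappa+\delta_0} > \epsilon \right), \tag{4.9}$$

and since $\tau_i \chi_i = 0$ for all $1 \le i \le n$, 

$$\begin{aligned}
P\left( \left| \frac{1}{n} \sum_{i=1}^n N_i D_i - E[\gamma_1 \xi_1] \right| > \epsilon \right) &\le P\left( \left. \left| \frac{1}{n} \sum_{i=1}^n \gamma_i \xi_i - E[\gamma_1 \xi_1] \right| + \frac{1}{n} \sum_{i=1}^n (\tau_i \xi_i + \gamma_i \chi_i) > \epsilon \right| \mathcal{D}_n \right) \\
&\le \frac{1}{P(\mathcal{D}_n)} P\left( \left| \frac{1}{n} \sum_{i=1}^n \gamma_i \xi_i - E[\gamma_1 \xi_1] \right| + n^{-\kappa+\delta} > \epsilon \right) && (4.10) \\
&\quad + P\left( \left. \frac{1}{n} \sum_{i=1}^n (\tau_i \xi_i + \gamma_i \chi_i) > n^{-\kappa+\delta} \right| \mathcal{D}_n \right) && (4.11)
\end{aligned}$$

for any $\delta_0 < \delta < \kappa$. By Lemma 2.7, $P(\mathcal{D}_n)$ converges to one, and by the Weak Law of Large Numbers 
(WLLN) we have that each of (4.8), (4.9) and (4.10) converges to zero as $n \to \infty$, as required. To see that 
(4.11) converges to zero use Markov's inequality to obtain 

$$P\left( \left. \frac{1}{n} \sum_{i=1}^n (\tau_i \xi_i + \gamma_i \chi_i) > n^{-\kappa+\delta} \right| \mathcal{D}_n \right) \le \frac{E[\tau_1 \xi_1 + \gamma_1 \chi_1 \mid \mathcal{D}_n]}{n^{-\kappa+\delta}} = \frac{E[(\tau_1 \xi_1 + \gamma_1 \chi_1) 1(\mathcal{D}_n)]}{P(\mathcal{D}_n) \, n^{-\kappa+\delta}}. \tag{4.12}$$

Now let $\mathcal{F}_n = \sigma(\gamma_1, \dots, \gamma_n, \xi_1, \dots, \xi_n)$ to compute 

$$E[(\tau_1 \xi_1 + \gamma_1 \chi_1) 1(\mathcal{D}_n)] = E\left[ \left( \xi_1 E[\tau_1 \mid \mathcal{F}_n] + \gamma_1 E[\chi_1 \mid \mathcal{F}_n] \right) 1(\mathcal{D}_n) \right] \le E\left[ (\xi_1 + \gamma_1) \frac{|\Delta_n|}{n} 1(\mathcal{D}_n) \right] \le 2 \mu n^{-\kappa+\delta_0},$$

which implies that (4.12) converges to zero. 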

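Finally, provided that $E[\gamma_1^2 + \xi_1^2] < \infty$, the WLLN combined with the arguments used to bound (4.11) give 

$$\begin{aligned}
P\left( \left| \frac{1}{n} \sum_{i=1}^n N_i^2 - E[\gamma_1^2] \right| > \epsilon \right) &\le P\left( \left. \left| \frac{1}{n} \sum_{i=1}^n \gamma_i^2 - E[\gamma_1^2] \right| + \frac{1}{n} \sum_{i=1}^n (2 \gamma_i \tau_i + \tau_i^2) > \epsilon \right| \mathcal{D}_n \right) \\
&\le o(1) + \frac{E[(2\gamma_1 + 1) \tau_1 \mid \mathcal{D}_n]}{n^{-\kappa+\delta}} \\
&\le o(1) + \frac{E[2\gamma_1 + 1]}{P(\mathcal{D}_n) \, n^{\delta - \delta_0}} \to 0,
\end{aligned}$$

and symmetrically, 

$$P\left( \left| \frac{1}{n} \sum_{i=1}^n D_i^2 - E[\xi_1^2] \right| > \epsilon \right) \to 0$$

as $n \to \infty$. 

□ 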
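
The proofs in this subsection only use a few properties of the adjustment variables: at most one of $\tau_i$, $\chi_i$ is nonzero for each node, exactly $|\Delta_n|$ of them equal one in total, and each node is hit with conditional probability $|\Delta_n|/n$. A minimal sketch of an adjustment step with these properties is given below. It assumes the $|\Delta_n|$ nodes are drawn uniformly without replacement, which is consistent with the binomial ratio computed in the proof of Theorem 2.10; the function name `adjust_degrees` is hypothetical and the rejection step that enforces $\mathcal{D}_n$ is omitted.

```python
import random

def adjust_degrees(gamma, xi, rng=None):
    """Given i.i.d. samples gamma (in-degrees) and xi (out-degrees), add one
    unit to |Delta_n| uniformly chosen nodes so that both sums agree."""
    rng = rng or random.Random()
    n = len(gamma)
    delta = sum(gamma) - sum(xi)  # Delta_n = Gamma_n - Xi_n
    if abs(delta) > n:
        raise ValueError("|Delta_n| exceeds n; the sample should be redrawn")
    chosen = rng.sample(range(n), abs(delta))  # without replacement
    N, D = list(gamma), list(xi)
    for i in chosen:
        if delta >= 0:
            D[i] += 1  # chi_i = 1: out-degree sum was too small
        else:
            N[i] += 1  # tau_i = 1: in-degree sum was too small
    return N, D
```

After the call, `sum(N) == sum(D)` always holds, and each entry differs from its sampled value by at most one.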



4.2. Configuration Model 

This subsection contains the proofs of Proposition 3.3, which establishes the uniformity of simple graphs, 
Propositions 3.5 and 3.7, which concern the repeated directed configuration model, and Proposition 3.9, which 
refers to the erased directed configuration model. 



Proof of Proposition 3.3. Suppose m and d have equal sum $l_n$, and number the inbound and outbound 
half-edges by $1, 2, \dots, l_n$. The process of matching half-edges in the configuration model is equivalent to a 
permutation $(p(1), p(2), \dots, p(l_n))$ of the numbers $(1, 2, \dots, l_n)$ where we pair the $i$th inbound half-edge to 
the $p(i)$th outbound half-edge, with all $l_n!$ permutations being equally likely. Note that different permutations 
can actually lead to the same graph, for example, if we switch the position of two outbound half-edges of 
the same node, so not all multigraphs have the same probability. Nevertheless, a simple graph can only 
be produced by $\prod_{i=1}^n d_i! \, m_i!$ different permutations; to see this note that for each node $v_i$, $i = 1, \dots, n$, 
we can permute its $m_i$ inbound half-edges and its $d_i$ outbound half-edges without changing the graph. It 
follows that since the number of permutations leading to a simple graph is the same for all simple graphs, 
then conditional on the resulting graph being simple, it is uniformly chosen among all simple graphs having 
bi-degree-sequence (m, d). □ 
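
The counting argument can be checked by brute force on a small example. The sketch below (with the hypothetical name `simple_graph_counts`) enumerates all $l_n!$ permutations for a given bi-degree-sequence and records, for every simple graph, how many permutations produce it; by the argument above each count should equal $\prod_{i} d_i! \, m_i!$.

```python
from collections import Counter
from itertools import permutations

def simple_graph_counts(in_deg, out_deg):
    """Map each simple graph (a frozenset of directed edges) to the number
    of stub permutations that generate it. Exponential in the degree sum."""
    in_stubs = [v for v, m in enumerate(in_deg) for _ in range(m)]
    out_stubs = [v for v, d in enumerate(out_deg) for _ in range(d)]
    counts = Counter()
    for p in permutations(range(len(out_stubs))):
        # The i-th inbound stub is paired with the p(i)-th outbound stub.
        edges = [(out_stubs[p[i]], in_stubs[i]) for i in range(len(in_stubs))]
        has_loop = any(u == v for u, v in edges)
        has_multi = len(set(edges)) < len(edges)
        if not has_loop and not has_multi:
            counts[frozenset(edges)] += 1
    return counts
```

With in-degrees `[0, 1, 1, 2]` and out-degrees `[2, 1, 1, 0]` there are 24 permutations and three simple graphs, each produced by exactly `2! * 2! = 4` permutations.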

Next, we give the proofs of the results related to the repeated directed configuration model. Before proceeding 
with the proof of Proposition 3.5 we give the following preliminary lemma, which will be used to establish 
that under Condition 3.4 the maximum in- and out-degrees cannot grow too fast. 

Lemma 4.2. Let $\{a_{nk} : 1 \le k \le n, n \in \mathbb{N}\}$ be a triangular array of nonnegative integers, and suppose there 
exist nonnegative numbers $\{p_j : j \in \mathbb{N} \cup \{0\}\}$ such that $\sum_{j=0}^\infty p_j = 1$, 

$$\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^n 1(a_{nk} = j) = p_j \quad \text{for all } j \in \mathbb{N} \cup \{0\} \quad \text{and} \quad \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^n a_{nk} = \sum_{j=0}^\infty j p_j < \infty.$$

Then, 

$$\lim_{n\to\infty} \max_{1 \le k \le n} \frac{a_{nk}}{n} = 0.$$

Proof. Define 

$$F(x) = \sum_{j=0}^{\lfloor x \rfloor} p_j \quad \text{and} \quad F_n(x) = \frac{1}{n} \sum_{k=1}^n 1(a_{nk} \le x)$$

and note that $F$ and $F_n$ are both distribution functions with support on the nonnegative integers. Define 
the pseudoinverse operator $h^{-1}(u) = \inf\{x \ge 0 : u \le h(x)\}$ and let 

$$X_n = F_n^{-1}(U) \quad \text{and} \quad X = F^{-1}(U),$$

where $U$ is a Uniform(0,1) random variable. It is easy to verify that $X_n$ and $X$ have distributions $F_n$ and 
$F$, respectively. Furthermore, the assumptions imply that 

$$X_n \to X \quad \text{a.s.}$$

as $n \to \infty$ and 

$$E[X_n] = \sum_{j=0}^\infty j \cdot \frac{1}{n} \sum_{k=1}^n 1(a_{nk} = j) = \frac{1}{n} \sum_{k=1}^n \sum_{j=0}^\infty j 1(a_{nk} = j) = \frac{1}{n} \sum_{k=1}^n a_{nk} \to E[X]$$

as $n \to \infty$, where the exchange of sums is justified by Fubini's theorem. Now note that by Fatou's lemma, 

$$\liminf_{n\to\infty} E[X_n 1(X_n \le \sqrt{n})] \ge E\left[ \liminf_{n\to\infty} X_n 1(X_n \le \sqrt{n}) \right] = E[X],$$

which implies that 

$$\lim_{n\to\infty} E[X_n 1(X_n > \sqrt{n})] = 0.$$

Finally, 

$$E[X_n 1(X_n > \sqrt{n})] = \sum_{j=\lfloor \sqrt{n} \rfloor + 1}^\infty j \cdot \frac{1}{n} \sum_{k=1}^n 1(a_{nk} = j) = \frac{1}{n} \sum_{k=1}^n \sum_{j=\lfloor \sqrt{n} \rfloor + 1}^\infty j 1(a_{nk} = j) = \frac{1}{n} \sum_{k=1}^n a_{nk} 1(a_{nk} > \sqrt{n}),$$

from where it follows that 

$$\lim_{n\to\infty} \max_{1 \le k \le n} \frac{a_{nk} 1(a_{nk} > \sqrt{n})}{n} = 0,$$

which in turn implies that 

$$\lim_{n\to\infty} \max_{1 \le k \le n} \frac{a_{nk}}{n} \le \lim_{n\to\infty} \left( \frac{\sqrt{n}}{n} + \max_{1 \le k \le n} \frac{a_{nk} 1(a_{nk} > \sqrt{n})}{n} \right) = 0.$$



□ 

Proof of Proposition 3.5. Following the proof of Proposition 7.9 in [23], we define the random variable $\tilde M_n$ 
to be the total number of pairs of multiple edges in the same direction, e.g., if from node $v_i$ to node $v_j$ 
there are $k \ge 2$ edges, their contribution to $\tilde M_n$ is $\binom{k}{2}$. Note that $M_n \le \tilde M_n$, with strict inequality whenever 
there is at least one pair of nodes having three or more multiple edges in the same direction. We claim that 
$\tilde M_n - M_n \xrightarrow{P} 0$ as $n \to \infty$, which implies that 

$$\text{if } (S_n, \tilde M_n) \Rightarrow (S, M), \text{ then } (S_n, M_n) \Rightarrow (S, M)$$

as $n \to \infty$. To prove the claim start by defining indicator random variables for each of the possible self-loops 
and multiple edges in the same direction that the multigraph can have. For the self-loops we use the notation 
$\mathbf{u} = (r, t, i)$ to define 

$$I_{\mathbf{u}} := 1(\text{self-loop from the } r\text{th outbound stub to the } t\text{th inbound stub of node } v_i),$$

and for the pairs of multiple edges in the same direction we use $\mathbf{w} = (r_1, t_1, r_2, t_2, i, j)$ to define 

$$J_{\mathbf{w}} := 1(r_s\text{th outbound stub of node } v_i \text{ paired to } t_s\text{th inbound stub of node } v_j, \ s = 1, 2).$$

The sets of possible vectors $\mathbf{u}$ and $\mathbf{w}$ are given by 

$$\mathcal{I} = \{ (r, t, i) : 1 \le i \le n, \ 1 \le r \le d_{ni}, \ 1 \le t \le m_{ni} \}, \quad \text{and}$$

$$\mathcal{J} = \{ (r_1, t_1, r_2, t_2, i, j) : 1 \le i \ne j \le n, \ 1 \le r_1 < r_2 \le d_{ni}, \ 1 \le t_1 \ne t_2 \le m_{nj} \},$$

respectively. It follows from this notation that 

$$S_n = \sum_{\mathbf{u} \in \mathcal{I}} I_{\mathbf{u}} \quad \text{and} \quad \tilde M_n = \sum_{\mathbf{w} \in \mathcal{J}} J_{\mathbf{w}}.$$

Next, note that by the union bound, 

$$\begin{aligned}
P\left( \tilde M_n - M_n \ge 1 \right) &\le P(\text{at least two nodes with three or more edges in the same direction}) \\
&\le \sum_{1 \le i \ne j \le n} P(\text{three or more edges from node } v_i \text{ to node } v_j) \\
&\le \sum_{1 \le i \ne j \le n} \frac{d_{ni} (d_{ni} - 1)(d_{ni} - 2) \, m_{nj} (m_{nj} - 1)(m_{nj} - 2)}{l_n (l_n - 1)(l_n - 2)} \\
&\le \left( \max_{1 \le i \le n} \frac{d_{ni}}{\sqrt{n}} \right) \left( \max_{1 \le j \le n} \frac{m_{nj}}{\sqrt{n}} \right) \frac{n^3}{l_n (l_n - 1)(l_n - 2)} \left( \frac{1}{n} \sum_{i=1}^n d_{ni}^2 \right) \left( \frac{1}{n} \sum_{j=1}^n m_{nj}^2 \right) \\
&= o(1)
\end{aligned}$$

as $n \to \infty$, where for the last step we used Condition 3.4 and Lemma 4.2. It follows that $\tilde M_n - M_n \xrightarrow{P} 0$ as 
claimed. 
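
For a concrete multigraph the two counts just defined are easy to compute from its edge list. The sketch below (hypothetical name `loops_and_multi_pairs`) returns the number of self-loops and the number of pairs of parallel edges in the same direction, where $k$ parallel edges between distinct nodes contribute $\binom{k}{2}$ pairs, as in the definition of $\tilde M_n$.

```python
from collections import Counter
from math import comb

def loops_and_multi_pairs(edges):
    """Return (S, M_tilde) for a directed multigraph given as a list of
    (tail, head) pairs, possibly with repetitions."""
    s = sum(1 for u, v in edges if u == v)
    # Multiplicity of each ordered pair of distinct nodes.
    mult = Counter((u, v) for u, v in edges if u != v)
    m_tilde = sum(comb(k, 2) for k in mult.values())
    return s, m_tilde
```

A triple edge contributes three pairs here, which is exactly the situation in which $\tilde M_n$ is strictly larger than $M_n$.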

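We now proceed to prove that $(S_n, \tilde M_n) \Rightarrow (S, M)$, where $S$ and $M$ are independent Poisson random variables 
with means $\lambda_1$ and $\lambda_2$, respectively. To do this we use Theorem 2.6 in [23] which says that if for any $p, q \in \mathbb{N}$ 

$$\lim_{n\to\infty} E\left[ (S_n)_p (\tilde M_n)_q \right] = \lambda_1^p \lambda_2^q,$$

where $(X)_r = X(X-1) \cdots (X-r+1)$, then $(S_n, \tilde M_n) \Rightarrow (S, M)$ as $n \to \infty$. To compute the expectation 
we use Theorem 2.7 in [23], which gives 

$$E\left[ (S_n)_p (\tilde M_n)_q \right] = \sum_{\mathbf{u}_1, \dots, \mathbf{u}_p \in \mathcal{I}} \ \sum_{\mathbf{w}_1, \dots, \mathbf{w}_q \in \mathcal{J}} P\left( I_{\mathbf{u}_1} = \cdots = I_{\mathbf{u}_p} = J_{\mathbf{w}_1} = \cdots = J_{\mathbf{w}_q} = 1 \right), \tag{4.13}$$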






where the sums are taken over all the p-permutations, respectively q-permutations, of the distinct indices in 
I, respectively J. 

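Next, by the fact that all stubs are uniformly paired, we have that 

$$P\left( I_{\mathbf{u}_1} = \cdots = I_{\mathbf{u}_p} = J_{\mathbf{w}_1} = \cdots = J_{\mathbf{w}_q} = 1 \right) = \frac{1}{\prod_{i=0}^{p+2q-1} (l_n - i)},$$

unless there is a conflict in the attachment rules, i.e., one stub is required to pair with two or more different 
stubs within the indices $\{\mathbf{u}_1, \dots, \mathbf{u}_p\}$ and $\{\mathbf{w}_1, \dots, \mathbf{w}_q\}$, in which case 

$$P\left( I_{\mathbf{u}_1} = \cdots = I_{\mathbf{u}_p} = J_{\mathbf{w}_1} = \cdots = J_{\mathbf{w}_q} = 1 \right) = 0. \tag{4.14}$$

Therefore, from (4.13) we obtain 

$$E\left[ (S_n)_p (\tilde M_n)_q \right] \le \sum_{\mathbf{u}_1, \dots, \mathbf{u}_p \in \mathcal{I}} \ \sum_{\mathbf{w}_1, \dots, \mathbf{w}_q \in \mathcal{J}} \frac{1}{\prod_{i=0}^{p+2q-1} (l_n - i)} = \frac{|\mathcal{I}| (|\mathcal{I}| - 1) \cdots (|\mathcal{I}| - p + 1) \, |\mathcal{J}| (|\mathcal{J}| - 1) \cdots (|\mathcal{J}| - q + 1)}{l_n (l_n - 1) \cdots (l_n - (p + 2q - 1))},$$

where $|A|$ denotes the cardinality of set $A$. Now note that 

$$|\mathcal{I}| = \sum_{i=1}^n m_{ni} d_{ni}, \quad \text{and}$$

$$|\mathcal{J}| = \sum_{1 \le i \ne j \le n} \frac{d_{ni} (d_{ni} - 1)}{2} m_{nj} (m_{nj} - 1) = \frac{1}{2} \left( \sum_{i=1}^n m_{ni} (m_{ni} - 1) \right) \left( \sum_{i=1}^n d_{ni} (d_{ni} - 1) \right) - \frac{1}{2} \sum_{i=1}^n m_{ni} (m_{ni} - 1) d_{ni} (d_{ni} - 1).$$

By Lemma 4.2 and Condition 3.4 we have 

$$\sum_{i=1}^n m_{ni} (m_{ni} - 1) d_{ni} (d_{ni} - 1) \le \left( \max_{1 \le i \le n} m_{ni} \right) \left( \max_{1 \le i \le n} d_{ni} \right) \sum_{i=1}^n m_{ni} d_{ni} = o(n^2)$$

as $n \to \infty$. Hence, it follows from Condition 3.4 that 

$$\frac{|\mathcal{I}|}{n} = E[\gamma \xi] + o(1), \qquad \frac{|\mathcal{J}|}{n^2} = \frac{1}{2} E[\gamma(\gamma - 1)] E[\xi(\xi - 1)] + o(1), \quad \text{and} \quad \frac{n}{l_n} = \frac{1}{\mu} + o(1)$$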

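as $n \to \infty$. Since $p$ and $q$ remain fixed as $n \to \infty$, we have 

$$\limsup_{n\to\infty} E\left[ (S_n)_p (\tilde M_n)_q \right] \le \left( \lim_{n\to\infty} \frac{|\mathcal{I}|}{n} \right)^p \left( \lim_{n\to\infty} \frac{|\mathcal{J}|}{n^2} \right)^q \left( \lim_{n\to\infty} \frac{n}{l_n} \right)^{p+2q} = (E[\gamma \xi])^p \left( \frac{E[\gamma(\gamma-1)] E[\xi(\xi-1)]}{2} \right)^q \left( \frac{1}{\mu} \right)^{p+2q} = \lambda_1^p \lambda_2^q. \tag{4.15}$$

To prove the matching lower bound, we note that (4.14) occurs exactly when there is a conflict in the 
attachment rules. Each time a conflict happens, the numerator of (4.15) decreases by one. Therefore, 

$$\begin{aligned}
E\left[ (S_n)_p (\tilde M_n)_q \right] &= \frac{|\mathcal{I}| (|\mathcal{I}| - 1) \cdots (|\mathcal{I}| - p + 1) \, |\mathcal{J}| (|\mathcal{J}| - 1) \cdots (|\mathcal{J}| - q + 1)}{l_n (l_n - 1) \cdots (l_n - (p + 2q - 1))} \\
&\quad - \sum_{\mathbf{u}_1, \dots, \mathbf{u}_p \in \mathcal{I}} \ \sum_{\mathbf{w}_1, \dots, \mathbf{w}_q \in \mathcal{J}} \frac{1(\mathbf{u}_1, \dots, \mathbf{u}_p, \mathbf{w}_1, \dots, \mathbf{w}_q \text{ have a conflict})}{\prod_{i=0}^{p+2q-1} (l_n - i)} \\
&= \lambda_1^p \lambda_2^q - \frac{1}{(\mu n)^{p+2q}} \sum_{\mathbf{u}_1, \dots, \mathbf{u}_p \in \mathcal{I}} \ \sum_{\mathbf{w}_1, \dots, \mathbf{w}_q \in \mathcal{J}} 1(\mathbf{u}_1, \dots, \mathbf{u}_p, \mathbf{w}_1, \dots, \mathbf{w}_q \text{ have a conflict}) + o(1)
\end{aligned}$$

as $n \to \infty$. To bound the total number of conflicts note that there are three possibilities: 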

a) a stub is assigned to two different self-loops, or 

b) a stub is assigned to a self-loop and a multiple edge, or 

c) a stub is assigned to two different multiple edges. 

We now discuss each of the cases separately. For conflicts of type (a) suppose there is a conflict between the 
self-loops $\mathbf{u}_a$ and $\mathbf{u}_b$; the remaining $p - 2$ self-loops and $q$ pairs of multiple edges can be chosen freely. Then 
the number of such conflicts is bounded by $|\mathcal{I}|^{p-2} |\mathcal{J}|^q = O(n^{p+2q-2})$, hence it suffices to show that the total 
number of conflicting pairs $(\mathbf{u}_a, \mathbf{u}_b)$ is $o(n^2)$ as $n \to \infty$. Now, to see that this is indeed the case, first choose 
the node $v_i$ where the conflicting pair is; if the conflict is that an outbound stub is assigned to two different 
inbound stubs then we can choose the problematic outbound stub in $d_{ni}$ ways and the two inbound stubs in 
$m_{ni}(m_{ni} - 1)$ ways, whereas if the conflict is that an inbound stub is assigned to two different outbound stubs 
then we can choose the problematic inbound stub in $m_{ni}$ ways and the two outbound stubs in $d_{ni}(d_{ni} - 1)$ 
ways. Thus, the total number of conflicting pairs is bounded by 

$$\sum_{i=1}^n \left( d_{ni} m_{ni}^2 + m_{ni} d_{ni}^2 \right) \le \left( \max_{1 \le i \le n} m_{ni} + \max_{1 \le i \le n} d_{ni} \right) \sum_{i=1}^n m_{ni} d_{ni} = o(n^{3/2}) = o(n^2).$$

For conflicts of type (b) suppose there is a conflict between the self-loop $\mathbf{u}_a$ and the pair of multiple edges 
$\mathbf{w}_b$; choose the remaining $p - 1$ self-loops and $q - 1$ multiple edges freely. Then, the number of such conflicts is 
bounded by $|\mathcal{I}|^{p-1} |\mathcal{J}|^{q-1} = O(n^{p+2q-3})$, and it suffices to show that the number of conflicting pairs $(\mathbf{u}_a, \mathbf{w}_b)$ 
is $o(n^3)$ as $n \to \infty$. Similarly as in case (a), an outbound stub of node $v_i$ can be paired to a self-loop and 
a multiple edge to node $v_j$ in $d_{ni} m_{ni} m_{nj} (d_{ni} - 1)(m_{nj} - 1)$ ways, and an inbound stub of node $v_i$ can be 
paired to a self-loop and a multiple edge from node $v_j$ in $m_{ni} d_{ni} d_{nj} (m_{ni} - 1)(d_{nj} - 1)$ ways, and so the total 
number of conflicting pairs is bounded by 

$$\sum_{i=1}^n \sum_{j=1}^n \left( d_{ni}^2 m_{ni} m_{nj}^2 + m_{ni}^2 d_{ni} d_{nj}^2 \right) \le \left( \max_{1 \le i \le n} m_{ni} + \max_{1 \le i \le n} d_{ni} \right) \left( \sum_{i=1}^n m_{ni} d_{ni} \right) \left( \sum_{j=1}^n \left( m_{nj}^2 + d_{nj}^2 \right) \right) = o(n^{5/2}) = o(n^3).$$



Finally, for conflicts of type (c) we first fix $\mathbf{w}_a$ and $\mathbf{w}_b$ and choose freely the remaining $p$ self-loops and $q - 2$ 
multiple edges, which can be done in less than $|\mathcal{I}|^p |\mathcal{J}|^{q-2} = O(n^{p+2q-4})$ ways. It then suffices to show that 
the number of conflicting pairs $(\mathbf{w}_a, \mathbf{w}_b)$ is $o(n^4)$ as $n \to \infty$. A similar reasoning to that used in the previous 
cases gives that the total number of conflicting pairs is bounded by 

$$2 \sum_{i=1}^n \sum_{j=1}^n \sum_{k=1}^n \left( d_{ni}^3 m_{nj}^2 m_{nk}^2 + m_{ni}^3 d_{nj}^2 d_{nk}^2 \right) \le 2 \left( \max_{1 \le i \le n} m_{ni} + \max_{1 \le i \le n} d_{ni} \right) \left( \sum_{i=1}^n \left( m_{ni}^2 + d_{ni}^2 \right) \right)^3 = o(n^{7/2}) = o(n^4).$$

We conclude that in any of the three cases the number of conflicts is negligible, which completes the proof. □ 



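Proof of Proposition 3.7. Let $\mathcal{S}_n$ be the event that the resulting graph is simple, and note that the bi-degree-sequence 
$(\mathbf{N}^{(r)}, \mathbf{D}^{(r)})$ is the same as $(\mathbf{N}, \mathbf{D})$ given $\mathcal{S}_n$. 

To prove part (a) note that for any $i, j = 0, 1, 2, \dots$, 

$$h^{(n)}(i,j) = \frac{1}{n} \sum_{k=1}^n P(N_k = i, D_k = j \mid \mathcal{S}_n) = \frac{1}{P(\mathcal{S}_n)} P(N_1 = i, D_1 = j, \mathcal{S}_n),$$

since the $\{(N_k, D_k)\}_{k=1}^n$ are identically distributed. Now let $\mathcal{G}_n = \sigma(N_1, \dots, N_n, D_1, \dots, D_n)$ and condition 
on $\mathcal{G}_n$ to obtain 

$$P(N_1 = i, D_1 = j, \mathcal{S}_n) = E[1(N_1 = i, D_1 = j) P(\mathcal{S}_n \mid \mathcal{G}_n)],$$

from where it follows that 

$$\begin{aligned}
\left| h^{(n)}(i,j) - f_i g_j \right| &\le \left| \frac{E[1(N_1 = i, D_1 = j)(P(\mathcal{S}_n \mid \mathcal{G}_n) - P(\mathcal{S}_n))]}{P(\mathcal{S}_n)} \right| + |P(N_1 = i, D_1 = j) - f_i g_j| \\
&\le E\left[ \left| \frac{P(\mathcal{S}_n \mid \mathcal{G}_n)}{P(\mathcal{S}_n)} - 1 \right| \right] + |P(N_1 = i, D_1 = j) - f_i g_j|.
\end{aligned}$$

Theorem 2.10 gives that the second term converges to zero, and for the first term use Theorem 3.6 to obtain 
that both $P(\mathcal{S}_n)$ and $P(\mathcal{S}_n \mid \mathcal{G}_n)$ converge to the same positive limit, so by dominated convergence, 

$$\lim_{n\to\infty} E\left[ \left| \frac{P(\mathcal{S}_n \mid \mathcal{G}_n)}{P(\mathcal{S}_n)} - 1 \right| \right] \le E\left[ \lim_{n\to\infty} \left| \frac{P(\mathcal{S}_n \mid \mathcal{G}_n)}{P(\mathcal{S}_n)} - 1 \right| \right] = 0.$$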



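For part (b) we only show the proof for $\hat g_k^{(n)}$, since the proof for $\hat f_k^{(n)}$ is symmetrical. Note that $\hat g_k^{(n)}$ is a 
quantity defined on $\mathcal{S}_n$. Fix $\epsilon > 0$ and use the union bound to obtain 

$$\begin{aligned}
P\left( \left. \left| \hat g_k^{(n)} - g_k \right| > \epsilon \right| \mathcal{S}_n \right) &\le \frac{1}{P(\mathcal{S}_n)} P\left( \left| \frac{1}{n} \sum_{i=1}^n 1(D_i = k) - g_k \right| > \epsilon \right) \\
&\le \frac{1}{P(\mathcal{S}_n) P(\mathcal{D}_n)} P\left( \frac{1}{n} \sum_{i=1}^n |1(\xi_i + \chi_i = k) - 1(\xi_i = k)| > \epsilon/2, \ \mathcal{D}_n \right) && (4.16) \\
&\quad + \frac{1}{P(\mathcal{S}_n) P(\mathcal{D}_n)} P\left( \left| \frac{1}{n} \sum_{i=1}^n 1(\xi_i = k) - g_k \right| > \epsilon/2 \right). && (4.17)
\end{aligned}$$

By Theorem 3.6 and Lemma 2.7, $P(\mathcal{S}_n)$ and $P(\mathcal{D}_n)$ are bounded away from zero, so we only need to show that 
the numerators converge to zero. The arguments are the same as those used in the proof of Proposition 2.11: 
for (4.17) use Chebyshev's inequality to obtain that 

$$P\left( \left| \frac{1}{n} \sum_{i=1}^n 1(\xi_i = k) - g_k \right| > \epsilon/2 \right) \le \frac{\mathrm{Var}(1(\xi_1 = k))}{n (\epsilon/2)^2} \to 0$$

as $n \to \infty$, and for (4.16) 

$$\begin{aligned}
P\left( \frac{1}{n} \sum_{i=1}^n |1(\xi_i + \chi_i = k) - 1(\xi_i = k)| > \epsilon/2, \ \mathcal{D}_n \right) &\le P\left( \frac{1}{n} \sum_{i=1}^n 1(\chi_i = 1) > \epsilon/2, \ \mathcal{D}_n \right) \\
&\le P\left( \frac{|\Delta_n|}{n} > \epsilon/2, \ \mathcal{D}_n \right) \le 1\left( n^{-\kappa+\delta_0} > \epsilon/2 \right),
\end{aligned}$$

which also converges to zero. This completes the proof. 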



□ 

Finally, the last result of the paper, which refers to the erased directed configuration model, is given below. 
Since the technical part of the proof is to show that the probability that no in-degrees or out-degrees of a 
fixed node are removed during the erasing procedure converges to one, we split the proof of Proposition 3.9 
into two parts. The following lemma contains the more delicate step. 

Lemma 4.3. Consider the graph obtained through the erased directed configuration model using as bi-degree-sequence 
$(\mathbf{N}, \mathbf{D})$, as constructed in Subsection 2.1. Let $E^+$ and $E^-$ be the number of inbound stubs and 
outbound stubs, respectively, that have been removed from node $v_1$ during the erasing procedure. Then, 

$$\lim_{n\to\infty} P(E^+ = 0) = 1 \quad \text{and} \quad \lim_{n\to\infty} P(E^- = 0) = 1.$$



Proof. We only show the result for $E^+$ since the proof for $E^-$ is symmetric. Define the set 

$$\mathcal{P}_n^t = \{ (i_1, \dots, i_t) : 2 \le i_1 \ne i_2 \ne \cdots \ne i_t \le n \}, \qquad 1 \le t \le n,$$

and note that in order for all the inbound stubs of node $v_1$ to survive the erasing procedure, it must have 
been that they were paired to outbound stubs of $N_1$ different nodes from $\{v_2, \dots, v_n\}$. Before we proceed it 
is helpful to recall some definitions from Section 2: $L_n = \sum_{i=1}^n N_i = \sum_{i=1}^n D_i$, $\Gamma_n = \sum_{i=1}^n \gamma_i$, $\Xi_n = \sum_{i=1}^n \xi_i$, 
$\Delta_n = \Gamma_n - \Xi_n$, and $\mathcal{D}_n = \{ |\Delta_n| \le n^s \}$, where $s = 1 - \kappa + \delta_0$; also, $\{\gamma_i\}$ and $\{\xi_i\}$ are independent 
sequences of i.i.d. random variables having distributions $F$ and $G$, respectively. Now fix $0 < \epsilon < 1 - s$ and 
let $\mathcal{G}_n = \sigma(N_1, \dots, N_n, D_1, \dots, D_n)$. Then, since $D_i = \xi_i + \chi_i \ge \xi_i$, 

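$$\begin{aligned}
P(E^+ = 0) &= E[P(E^+ = 0 \mid \mathcal{G}_n)] \ge E[P(E^+ = 0 \mid \mathcal{G}_n) 1(1 \le N_1 \le n^\epsilon)] + P(N_1 = 0) \\
&= E\left[ 1(1 \le N_1 \le n^\epsilon) \sum_{(i_1, \dots, i_{N_1}) \in \mathcal{P}_n^{N_1}} D_{i_1} D_{i_2} \cdots D_{i_{N_1}} \frac{(L_n - N_1)!}{L_n!} \right] + P(N_1 = 0) \\
&\ge E\left[ \left. 1(1 \le \gamma_1 + \tau_1 \le n^\epsilon) \sum_{(i_1, \dots, i_{\gamma_1+\tau_1}) \in \mathcal{P}_n^{\gamma_1+\tau_1}} \xi_{i_1} \xi_{i_2} \cdots \xi_{i_{\gamma_1+\tau_1}} \frac{(L_n - \gamma_1 - \tau_1)!}{L_n!} \right| \mathcal{D}_n \right] + P(N_1 = 0) \\
&\ge E\left[ \left. 1(1 \le \gamma_1 \le n^\epsilon) 1(\tau_1 = 0) \sum_{(i_1, \dots, i_{\gamma_1}) \in \mathcal{P}_n^{\gamma_1}} \frac{\xi_{i_1} \xi_{i_2} \cdots \xi_{i_{\gamma_1}}}{(L_n)^{\gamma_1}} \right| \mathcal{D}_n \right] + P(N_1 = 0). && (4.18)
\end{aligned}$$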



Next, condition on T n = c(7i, ■ ■ ■ , 7n> £ii 

P(n = 0| T n ) = 1 (A„ > 0) 



£ n ) and note that 

r„ 



r„ + |A„ 



1(A„ < 0) > 



r„ 



It follows that the expectation in (|4.18[) is equal to 

l(l< 7l <n £ ) 



E 



p{T 1 = Q\T n y 



> E 



> E 



1(1 <7i <» e ) 
(r„ + |A„|)-n 



(i 1 ,i 2 ,---,i T1 )e'P+ 



E 



Cii £»2 ' ' ' Si-y! 



(ti,»2, 



1(1 < 7i < » £ )r, 



E 



Cil Ci2 ' ' ' ^71 



(ii,i2,—,i~ n )&'Pn 
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1 



PiP n ) 
1 

P(D n ) 



E 



E 



l(l<7i<n e ) E 
1(1 < 7i < <b . E 



i(x> n )r n 

(r„ + n s )Ti+ 
l(V n )T n n^ 



(n - 1 - 7i)!nTi [ (r„ + n s )Ti +1 

It follows by Fatou's lemma, Lemma [2771 and Theorem 12.101 that 

l(X> n )r n nT» 



X ' S»l SJ2 ' ' ' 6y] 

66 • • • £71 



liminf P(P+ = 0) > E 



1(71 > 1) liminf E 



66" "6 



7i 



Next, define the function u+ : N — > [0, 00) as 

"l(|r n _i + 1 - H n | < n 5 )^! + i)n* 



u+(t) = E 



(r n _i + t + 



£16 ■■•6 



7i 



71 



+ P( 7 i =0). 



and note that it only remains to prove that for all t £ N, liminf n ^oo u^(t) = 1 
Now let < a < fi and note that 

"l(|r„_i +t-En\ < n s ) 



u+(t) > E 



E 



l(r„_i > an) 



66 
(r„_! + i)n* 



P(r„_x < an) 



1 



(r n _i +t + n s )*+ 1 fi* 

The SLLN and bounded convergence give lim n _ J . 00 P(T n —i < an) = and 



lim sup E 



l(r„_i > an) 



(r n _!+t)n* 



< E 

from where it follows that 



66 • • -6 lim sup 



lim inf u+ (t) > lim inf P 



(T n _i + t + n s ) t + 1 jU* 

(r n _i + tK 1 

(r n _i + i + n s )'+ 1 



l(|r w _i+t-5 n | <n s ) 



66 ■••6 

66 ■••6 

= 0, 

66 



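The last step is to condition on $\xi_1, \xi_2, \dots, \xi_t$ and use Fatou's Lemma again to obtain 

$$\begin{aligned}
\liminf_{n\to\infty} E\left[ 1(|\Gamma_{n-1} + t - \Xi_n| \le n^s) \frac{\xi_1 \xi_2 \cdots \xi_t}{\mu^t} \right] &= \liminf_{n\to\infty} E\left[ \frac{\xi_1 \xi_2 \cdots \xi_t}{\mu^t} P(|\Gamma_{n-1} + t - \Xi_n| \le n^s \mid \xi_1, \dots, \xi_t) \right] \\
&\ge E\left[ \frac{\xi_1 \xi_2 \cdots \xi_t}{\mu^t} \liminf_{n\to\infty} P(|\Gamma_{n-1} + t - \Xi_n| \le n^s \mid \xi_1, \dots, \xi_t) \right].
\end{aligned}$$

Finally, by the same reasoning used in the proof of Lemma 2.7 we obtain 

$$\lim_{n\to\infty} P(|\Gamma_{n-1} + t - \Xi_n| \le n^s \mid \xi_1, \dots, \xi_t) = 1 \quad \text{a.s.}$$

Since $E[\xi_1 \xi_2 \cdots \xi_t] / \mu^t = 1$, this completes the proof. 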



□ 

Proof of Proposition 3.9. To prove part (a) note that since the $\{(N_i^{(e)}, D_i^{(e)})\}_{i=1}^n$ are identically distributed, 
then $h^{(n)}(i,j) = P(N_1^{(e)} = i, D_1^{(e)} = j)$. It follows that 

$$\left| h^{(n)}(i,j) - f_i g_j \right| \le \left| P(N_1^{(e)} = i, D_1^{(e)} = j) - P(N_1 = i, D_1 = j) \right| + |P(N_1 = i, D_1 = j) - f_i g_j|.$$

By Theorem 2.10 we have that $|P(N_1 = i, D_1 = j) - f_i g_j| \to 0$, as $n \to \infty$, and for the remaining term note 
that 

$$\begin{aligned}
\left| P(N_1^{(e)} = i, D_1^{(e)} = j) - P(N_1 = i, D_1 = j) \right| &\le E\left[ \left| 1(N_1^{(e)} = i, D_1^{(e)} = j) - 1(N_1 = i, D_1 = j) \right| \right] \\
&\le E\left[ \left| 1(D_1^{(e)} = j) - 1(D_1 = j) \right| \right] + E\left[ \left| 1(N_1^{(e)} = i) - 1(N_1 = i) \right| \right]. && (4.19)
\end{aligned}$$

To bound the expectations in (4.19) let $E^+$ and $E^-$ be the number of inbound stubs and outbound stubs, 
respectively, that have been removed from node $v_1$ during the erasing procedure. Then, 

$$E\left[ \left| 1(D_1^{(e)} = j) - 1(D_1 = j) \right| \right] \le P(E^- \ge 1) \quad \text{and} \quad E\left[ \left| 1(N_1^{(e)} = i) - 1(N_1 = i) \right| \right] \le P(E^+ \ge 1).$$

By Lemma 4.3, 

$$\lim_{n\to\infty} P(E^- \ge 1) = 0 \quad \text{and} \quad \lim_{n\to\infty} P(E^+ \ge 1) = 0,$$

which completes the proof of part (a). 

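For part (b) we only show the proof for $\hat g_k^{(n)}$, since the proof for $\hat f_k^{(n)}$ is symmetrical. Fix $\epsilon > 0$ and use the 
triangle inequality and the union bound to obtain 

$$P\left( \left| \hat g_k^{(n)} - g_k \right| > \epsilon \right) \le P\left( \left| \hat g_k^{(n)} - \frac{1}{n} \sum_{i=1}^n 1(D_i = k) \right| > \epsilon/2 \right) + P\left( \left| \frac{1}{n} \sum_{i=1}^n 1(D_i = k) - g_k \right| > \epsilon/2 \right).$$

From the proof of Proposition 3.7 we know that the second probability converges to zero as $n \to \infty$, and 
for the first one use Markov's inequality to obtain 

$$\begin{aligned}
P\left( \left| \hat g_k^{(n)} - \frac{1}{n} \sum_{i=1}^n 1(D_i = k) \right| > \epsilon/2 \right) &\le P\left( \frac{1}{n} \sum_{i=1}^n \left| 1(D_i^{(e)} = k) - 1(D_i = k) \right| > \epsilon/2 \right) \\
&\le \frac{2}{\epsilon} E\left[ \left| 1(D_1^{(e)} = k) - 1(D_1 = k) \right| \right] \\
&\le \frac{2}{\epsilon} P(E^- \ge 1) \to 0,
\end{aligned}$$

as $n \to \infty$, by Lemma 4.3. 

□ 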